BEFORE YOU START


This is the first volume of ULTRIX-32 Supplementary Documents, a three volume set that 
contains articles describing the ULTRIX-32 system. The authors are computer scientists and 
program developers at Bell Laboratories and the University of California at Berkeley. The 
articles explain the software tools and utilities available on your ULTRIX-32 system. They 
constitute most of the lore that enriches this operating system; topics range from 
getting-started procedures to the details of screen updating and cursor movement facilities. 


Each volume in this set contains several parts, and each part begins with an introduction. 
Each introduction serves as a map that will help you find your way around in the documenta- 
tion, allowing you to select articles that relate to your interest. Each introduction gives an 
overview of the material covered in the part and a description of the articles included. Most 
readers will not need to read all articles in any part, since many articles cover parallel topics. 
For example, Part 3 in this first volume contains articles describing several text editors. You 
should be able to choose one editor after reading the introduction; then you can proceed to 
the relevant article. 


These articles provide authoritative and accurate information that is unavailable elsewhere. 
However, you should be aware that some of the information in some articles is dated. We 
include those articles because many of the concepts they develop are still current and impor- 
tant. 


At the end of each volume in this set, you will find a master index identifying topics for all 
three volumes. 


Topics in Volume I 


This first volume contains articles written for general use. You should find many of the arti- 
cles helpful no matter how you plan to use your ULTRIX-32 system. The two articles in Part 
1 introduce the entire three-volume set; however, readers who are unfamiliar with operating 
systems and programming and readers new to the ULTRIX-32 and UNIX systems should 
begin with Part 2, Getting Started. The articles introduce basic concepts and demonstrate 
simple procedures. 


You will need to use a text editor if you plan to write (create or modify) files. Part 3, Text 
Editors, gives comprehensive information on five editors: ed, edit, vi, ex, and sed. 


Articles in Part 4, Command Interpreters, introduce the two shells provided with the 
ULTRIX-32 system: the Bourne Shell and the C Shell. Each shell serves as a set of handles 
that gives the user access to the ULTRIX-32 utilities. 


If you intend to use your ULTRIX-32 system to write and format any kind of document, you 
will find the articles on Document Preparation in Part 5 essential. Nroff and troff are text 
formatting utilities. In addition, the ULTRIX-32 software includes separate utilities that 
cooperate with the formatters to help you typeset mathematical expressions, set up tables, and 
create bibliographical references in your text. 


Part 6 includes articles that tell about a variety of unsupported software. 
PART 1: OVERVIEW


The first two articles in this volume introduce the entire three-volume set of ULTRIX Sup- 
plementary Documents. The article entitled “UNIX/32V — Summary” lists features of the 
UNIX system released in March 1979. ULTRIX-32 is based on the Berkeley 4.2BSD distribu- 
tion, which is in turn based on Bell Laboratories UNIX 32V and the UNIX 7th Edition. 


The second article, “The UNIX Time-Sharing System,” by Ritchie and Thompson, provides 
an overview and history of UNIX. The authors are the original developers of this software 
system. This article is suitable for readers who are familiar with computer software and 
operating systems. Although it describes UNIX as it was implemented in 1974, the article 
remains an important part of the UNIX documentation. With the exception of some details, 
it gives an accurate account of many of the concepts and features of ULTRIX-32. The 
authors convey the spirit of UNIX and ULTRIX-32, though the article includes some infor- 
mation that is no longer current. 
“The UNIX Time-Sharing System” explains these notable features of UNIX: 

• A pipe enables related processes to pass information to one another.

• A filter takes its input from one process and delivers its output to another process.

• A shell serves as a user interface to the system.

• An image is a computer execution environment.

• A process is the execution of an image.

• A process may create another process. The creating process is the parent; the created
  process is the child.
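
The pipe and parent/child ideas in the list above can be sketched in a few lines. This is only an illustration written in Python for a present-day UNIX-like system, not code from the article; the function name ask_child is hypothetical.

```python
import os

def ask_child(message: bytes) -> bytes:
    """Fork a child that upper-cases `message` and returns it through a pipe."""
    read_end, write_end = os.pipe()   # one-way channel shared by parent and child
    pid = os.fork()                   # the creating process is the parent
    if pid == 0:                      # this branch runs in the created process, the child
        os.close(read_end)
        os.write(write_end, message.upper())
        os.close(write_end)
        os._exit(0)                   # exit at once, skipping the parent's cleanup code
    os.close(write_end)               # the parent keeps only the read end
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:                 # an empty read means the child closed its end
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)                # collect the child's exit status
    return b"".join(chunks)
```

The parent and child are "related" in the sense used above: the pipe is created before the fork, so both processes inherit it.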

The article also tells how to: 

• Execute procedures in background, leaving your terminal free to perform other
  functions while the background procedures run.

• Create user interfaces that serve as alternatives to the shells.

• Set up restricted environments for some users.

• Detect and deal with hardware and software errors.

Be sure to read the last part of “The UNIX Time-Sharing System” if you want to know about
the early stages of UNIX development. Ritchie and Thompson explain their original goals
and design considerations, and they identify important steps in the evolution of the software
system that forms the basis of ULTRIX-32.
UNIX/32V — Summary


March 9, 1979 


A. What’s new: highlights of the UNIX†/32V System


32-bit world. UNIX/32V handles 32-bit addresses and 32-bit data. Devices are addressable 
to 2^31 bytes, files to 2^30 bytes. 


Portability. Code of the operating system and most utilities has been extensively revised to 
minimize its dependence on particular hardware. UNIX/32V is highly compatible with UNIX 
version 7.


Fortran 77. F77 compiler for the new standard language is compatible with C at the object 
level. A Fortran structurer, STRUCT, converts old, ugly Fortran into RATFOR, a structured 
dialect usable with F77. 


Shell. Completely new SH program supports string variables, trap handling, structured pro- 
gramming, user profiles, settable search path, multilevel file name generation, etc. 


Document preparation. TROFF phototypesetter utility is standard. NROFF (for terminals) 
is now highly compatible with TROFF. MS macro package provides canned commands 
for many common formatting and layout situations. TBL provides an easy to learn language 
for preparing complicated tabular material. REFER fills in bibliographic citations from a data 
base. 


UNIX-to-UNIX file copy. UUCP performs spooled file transfers between any two 
machines. 

Data processing. SED stream editor does multiple editing functions in parallel on a data 
stream of indefinite length. AWK report generator does free-field pattern selection and arith- 
metic operations. 

Program development. MAKE controls re-creation of complicated software, arranging for 
minimal recompilation. 


Debugging. ADB does postmortem and breakpoint debugging. 


C language. The language now supports definable data types, generalized initialization, 
block structure, long integers, unions, explicit type conversions. The LINT verifier does 
strong type checking and detection of probable errors and portability problems even across 
separately compiled functions. 


Lexical analyzer generator. LEX converts specification of regular expressions and 
semantic actions into a recognizing subroutine. Analogous to YACC. 


Graphics. Simple graph-drawing utility, graphic subroutines, and generalized plotting filters 
adapted to various devices are now standard. 


Standard input-output package. Highly efficient buffered stream I/O is integrated with 
formatted input and output.


Other. The operating system and utilities have been enhanced and freed of restrictions in 
many other ways too numerous to relate. 


† UNIX is a Trademark of Bell Laboratories.
B. Hardware


The UNIX/32V operating system runs on a DEC VAX-11/780* with at least the following 
equipment: 


memory: 256K bytes or more. 

disk: RP06, RM03, or equivalent. 

tape: any 9-track MASSBUS-compatible tape drive. 


The following equipment is strongly recommended: 

communications controller such as DZ11 or DL11. 

full duplex 96-character ASCII terminals. 

extra disk for system backup. 


The system is normally distributed on 9-track tape. The minimum memory and disk space 
specified is enough to run and maintain UNIX/32V, and to keep all source on line. More 
memory will be needed to handle a large number of users, big data bases, diversified comple- 
ments of devices, or large programs. The resident code occupies 40-55K bytes depending on 
configuration; system data also occupies 30-55K bytes. 


C. Software 


Most of the programs available as UNIX/32V commands are listed. Source code and 
printed manuals are distributed for all of the listed software except games. Almost all of the 
code is written in C. Commands are self-contained and do not require extra setup informa- 
tion, unless specifically noted as “interactive.” Interactive programs can be made to run from 
a prepared script simply by redirecting input. Most programs intended for interactive use 
(e.g., the editor) allow for an escape to command level (the Shell). Most file processing com- 
mands can also go from standard input to standard output (“filters”). The piping facility of 
the Shell may be used to connect such filters directly to the input or output of other pro- 
grams. 
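
The idea of a filter, and of connecting filters, can be illustrated with a small sketch. It is written in Python purely for illustration; the names upcase, number_lines, and pipeline are hypothetical, and the helper runs the filters one after another rather than simultaneously as the Shell's piping facility does.

```python
import io

def upcase(source, sink):
    """Filter: copy input to output, converting letters to upper case."""
    for line in source:
        sink.write(line.upper())

def number_lines(source, sink):
    """Filter: copy input to output, prefixing each line with its number."""
    for count, line in enumerate(source, start=1):
        sink.write(f"{count}\t{line}")

def pipeline(text, *filters):
    """Feed `text` through each filter in turn, loosely like `a | b` in the Shell."""
    for run_filter in filters:
        source, sink = io.StringIO(text), io.StringIO()
        run_filter(source, sink)
        text = sink.getvalue()       # output of one stage becomes input of the next
    return text
```

Each filter only reads from one stream and writes to another, so calling it with sys.stdin and sys.stdout would let it take part in a real pipeline.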


1. Basic Software 


This includes the time-sharing operating system with utilities, and a compiler for the 
programming language C—enough software to write and run new applications and to maintain 
or modify UNIX/32V itself. 


1.1. Operating System 


O UNIX      The basic resident code on which everything else depends. Supports the system
            calls, and maintains the file system. A general description of UNIX design
            philosophy and system facilities appeared in the Communications of the ACM,
            July, 1974. A more extensive survey is in the Bell System Technical Journal
            for July-August 1978. Capabilities include:
            O Reentrant code for user processes.
            O “Group” access permissions for cooperative projects, with overlapping
              memberships.
            O Alarm-clock timeouts.
            O Timer-interrupt sampling and interprocess monitoring for debugging and
              measurement.
            O Multiplexed I/O for machine-to-machine communication.

O DEVICES   All I/O is logically synchronous. I/O devices are simply files in the file
            system. Normally, invisible buffering makes all physical record structure and
            device characteristics transparent and exploits the hardware’s ability to do
            overlapped I/O. Unbuffered physical record I/O is available for unusual
            applications. Drivers for these devices are available:
            O Asynchronous interfaces: DZ11, DL11. Support for most common ASCII
              terminals.
            O Automatic calling unit interface: DN11.
            O Printer/plotter: Versatek.
            O Magnetic tape: TE16.
            O Pack type disk: RP06, RM03; minimum-latency seek scheduling.
            O Physical memory of VAX-11, or mapped memory in resident system.
            O Null device.
            O Recipes are supplied to aid the construction of drivers for:
                Asynchronous interface: DH11.
                Synchronous interface: DU11.
                DECtape: TC11.
                Fixed head disk: RS11, RS03 and RS04.
                Cartridge-type disk: RK05.
                Phototypesetter: Graphic Systems System/1 through DR11C.

O BOOT      Procedures to get UNIX/32V started.


*VAX is a Trademark of Digital Equipment Corporation.


1.2. User Access Control 


O LOGIN     Sign on as a new user.
            O Verify password and establish user’s individual and group (project) identity.
            O Adapt to characteristics of terminal.
            O Establish working directory.
            O Announce presence of mail (from MAIL).
            O Publish message of the day.
            O Execute user-specified profile.
            O Start command interpreter or other initial program.

O PASSWD    Change a password.
            O User can change his own password.
            O Passwords are kept encrypted for security.

O NEWGRP    Change working group (project). Protects against unauthorized changes to
            projects.


1.3. Terminal Handling 


O TABS      Set tab stops appropriately for specified terminal type.

O STTY      Set up options for optimal control of a terminal. Insofar as they are deducible
            from the input, these options are set automatically by LOGIN.
            O Half vs. full duplex.
            O Carriage return+line feed vs. newline.
            O Interpretation of tabs.
            O Parity.
            O Mapping of upper case to lower.
            O Raw vs. edited input.
            O Delays for tabs, newlines and carriage returns.


1.4. File Manipulation 


O CAT       Concatenate one or more files onto standard output. Particularly used for
            unadorned printing, for inserting data into a pipeline, and for buffering output
            that comes in dribs and drabs. Works on any file regardless of contents.

O CP        Copy one file to another, or a set of files to a directory. Works on any file
            regardless of contents.

O PR        Print files with title, date, and page number on every page.
            O Multicolumn output.
            O Parallel column merge of several files.

O LPR       Off-line print. Spools arbitrary files to the line printer.

O CMP       Compare two files and report if different.

O TAIL      Print last n lines of input.
            O May print last n characters, or from n lines or characters to end.

O SPLIT     Split a large file into more manageable pieces. Occasionally necessary for
            editing (ED).

O DD        Physical file format translator, for exchanging data with foreign systems,
            especially IBM 370’s.

O SUM       Sum the words of a file.


1.5. Manipulation of Directories and File Names 


O RM        Remove a file. Only the name goes away if any other names are linked to the
            file.
            O Step through a directory deleting files interactively.
            O Delete entire directory hierarchies.

O LN        “Link” another name (alias) to an existing file.

O MV        Move a file or files. Used for renaming files.

O CHMOD     Change permissions on one or more files. Executable by files’ owner.

O CHOWN     Change owner of one or more files.

O CHGRP     Change group (project) to which a file belongs.

O MKDIR     Make a new directory.

O RMDIR     Remove a directory.

O CD        Change working directory.

O FIND      Prowl the directory hierarchy finding every file that meets specified criteria.
            O Criteria include:
                name matches a given pattern,
                creation date in given range,
                date of last use in given range,
                given permissions,
                given owner,
                given special file characteristics,
                boolean combinations of above.
            O Any directory may be considered to be the root.
            O Perform specified command on each file found.


1.6. Running of Programs 


O SH        The Shell, or command language interpreter.
            O Supply arguments to and run any executable program.
            O Redirect standard input, standard output, and standard error files.
            O Pipes: simultaneous execution with output of one process connected to the
              input of another.
            O Compose compound commands using:
                if ... then ... else,
                case switches,
                while loops,
                for loops over lists,
                break, continue and exit,
                parentheses for grouping.
            O Initiate background processes.
            O Perform Shell programs, i.e., command scripts with substitutable arguments.
            O Construct argument lists from all file names satisfying specified patterns.
            O Take special action on traps and interrupts.
            O User-settable search path for finding commands.
            O Executes user-settable profile upon login.
            O Optionally announces presence of mail as it arrives.
            O Provides variables and parameters with default setting.

O TEST      Tests for use in Shell conditionals.
            O String comparison.
            O File nature and accessibility.
            O Boolean combinations of the above.

O EXPR      String computations for calculating command arguments.
            O Integer arithmetic.
            O Pattern matching.

O WAIT      Wait for termination of asynchronously running processes.

O READ      Read a line from terminal, for interactive Shell procedure.

O ECHO      Print remainder of command line. Useful for diagnostics or prompts in Shell
            programs, or for inserting data into a pipeline.

O SLEEP     Suspend execution for a specified time.

O NOHUP     Run a command immune to hanging up the terminal.

O NICE      Run a command in low (or high) priority.

O KILL      Terminate named processes.

O CRON      Schedule regular actions at specified times.
            O Actions are arbitrary programs.
            O Times are conjunctions of month, day of month, day of week, hour and
              minute. Ranges are specifiable for each.

O AT        Schedule a one-shot action for an arbitrary time.

O TEE       Pass data between processes and divert a copy into one or more files.
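
The CRON entry in this section describes a time as a conjunction of five fields, each of which may be a range. The sketch below shows one way such a test could work. It is written in Python for illustration only: the field order (minute, hour, day of month, month, day of week), the "*" wildcard, the comma lists, and the names field_matches and is_due are assumptions, not taken from this summary.

```python
from datetime import datetime

def field_matches(spec: str, value: int) -> bool:
    """True if `value` satisfies one field: '*', a number, a range 'a-b', or a comma list."""
    if spec == "*":
        return True
    for part in spec.split(","):
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):   # a single number is a one-value range
            return True
    return False

def is_due(entry: str, when: datetime) -> bool:
    """True only if every one of the five time fields matches (a conjunction)."""
    specs = entry.split()[:5]
    values = (when.minute, when.hour, when.day, when.month,
              (when.weekday() + 1) % 7)             # convert Monday=0 to Sunday=0
    return all(field_matches(spec, value) for spec, value in zip(specs, values))
```

For example, the hypothetical entry "30 8 * 3 1-5" would be due at 8:30 on any weekday in March.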


1.7. Status Inquiries 


O LS        List the names of one, several, or all files in one or more directories.
            O Alphabetic or temporal sorting, up or down.
            O Optional information: size, owner, group, date last modified, date last
              accessed, permissions, i-node number.

O FILE      Try to determine what kind of information is in a file by consulting the file
            system index and by reading the file itself.

O DATE      Print today’s date and time. Has considerable knowledge of calendric and
            horological peculiarities.
            O May set UNIX/32V’s idea of date and time.

O DF        Report amount of free space on file system devices.

O DU        Print a summary of total space occupied by all files in a hierarchy.

O QUOT      Print summary of file space usage by user id.

O WHO       Tell who’s on the system.
            O List of presently logged in users, ports and times on.
            O Optional history of all logins and logouts.

O PS        Report on active processes.
            O List your own or everybody’s processes.
            O Tell what commands are being executed.
            O Optional status information: state and scheduling info, priority, attached
              terminal, what it’s waiting for, size.

O IOSTAT    Print statistics about system I/O activity.

O TTY       Print name of your terminal.

O PWD       Print name of your working directory.


1.8. Backup and Maintenance 


O MOUNT     Attach a device containing a file system to the tree of directories. Protects
            against nonsense arrangements.

O UMOUNT    Remove the file system contained on a device from the tree of directories.
            Protects against removing a busy device.

O MKFS      Make a new file system on a device.

O MKNOD     Make an i-node (file system entry) for a special file. Special files are physical
            devices, virtual devices, physical memory, etc.

O TP
O TAR       Manage file archives on magnetic tape or DECtape. TAR is newer.
            O Collect files into an archive.
            O Update DECtape archive by date.
            O Replace or delete DECtape files.
            O Print table of contents.
            O Retrieve from archive.

O DUMP      Dump the file system stored on a specified device, selectively by date, or
            indiscriminately.

O RESTOR    Restore a dumped file system, or selectively retrieve parts thereof.

O SU        Temporarily become the super user with all the rights and privileges thereof.
            Requires a password.

O DCHECK
O ICHECK
O NCHECK    Check consistency of file system.
            O Print gross statistics: number of files, number of directories, number of
              special files, space used, space free.
            O Report duplicate use of space.
            O Retrieve lost space.
            O Report inaccessible files.
            O Check consistency of directories.
            O List names of all files.

O CLRI      Peremptorily expunge a file and its space from a file system. Used to repair
            damaged file systems.

O SYNC      Force all outstanding I/O on the system to completion. Used to shut down
            gracefully.


1.9. Accounting 


The timing information on which the reports are based can be manually cleared or shut off
completely.

O AC        Publish cumulative connect time report.
            O Connect time by user or by day.
            O For all users or for selected users.

O SA        Publish Shell accounting report. Gives usage information on each command
            executed.
            O Number of times used.
            O Total system time, user time and elapsed time.
            O Optional averages and percentages.
            O Sorting on various fields.


1.10. Communication 


O MAIL      Mail a message to one or more users. Also used to read and dispose of incoming
            mail. The presence of mail is announced by LOGIN and optionally by SH.
            O Each message can be disposed of individually.
            O Messages can be saved in files or forwarded.

O CALENDAR  Automatic reminder service for events of today and tomorrow.

O WRITE     Establish direct terminal communication with another user.

O WALL      Write to all users.

O MESG      Inhibit receipt of messages from WRITE and WALL.

O CU        Call up another time-sharing system.
            O Transparent interface to remote machine.
            O File transmission.
            O Take remote input from local file or put remote output into local file.
            O Remote system need not be UNIX/32V.

O UUCP      UNIX to UNIX copy.
            O Automatic queuing until line becomes available and remote machine is up.
            O Copy between two remote machines.
            O Differences, mail, etc., between two machines.


1.11. Basic Program Development Tools 


Some of these utilities are used as integral parts of the higher level languages described in
section 2.

O AR        Maintain archives and libraries. Combines several files into one for
            housekeeping efficiency.
            O Create new archive.
            O Update archive by date.
            O Replace or delete files.
            O Print table of contents.
            O Retrieve from archive.

O AS        Assembler.
            O Creates object program consisting of
                code, normally read-only and sharable,
                initialized data or read-write code,
                uninitialized data.
            O Relocatable object code is directly executable without further
              transformation.
            O Object code normally includes a symbol table.
            O “Conditional jump” instructions become branches or branches plus jumps
              depending on distance.

O Library   The basic run-time library. These routines are used freely by all software.
            O Buffered character-by-character I/O.
            O Formatted input and output conversion (SCANF and PRINTF) for standard
              input and output, files, in-memory conversion.
            O Storage allocator.
            O Time conversions.
            O Number conversions.
            O Password encryption.
            O Quicksort.
            O Random number generator.
            O Mathematical function library, including trigonometric functions and
              inverses, exponential, logarithm, square root, bessel functions.

O ADB       Interactive debugger.
            O Postmortem dumping.
            O Examination of arbitrary files, with no limit on size.
            O Interactive breakpoint debugging with the debugger as a separate process.
            O Symbolic reference to local and global variables.
            O Stack trace for C programs.
            O Output formats:
                1-, 2-, or 4-byte integers in octal, decimal, or hex
                single and double floating point
                character and string
                disassembled machine instructions
            O Patching.
            O Searching for integer, character, or floating patterns.

O OD        Dump any file. Output options include any combination of octal or decimal or
            hex by words, octal by bytes, ASCII, opcodes, hexadecimal.
            O Range of dumping is controllable.

O LD        Link edit. Combine relocatable object files. Insert required routines from
            specified libraries.
            O Resulting code is sharable by default.

O LORDER    Places object file names in proper order for loading, so that files depending on
            others come after them.

O NM        Print the namelist (symbol table) of an object program. Provides control over
            the style and order of names that are printed.

O SIZE      Report the memory requirements of one or more object files.

O STRIP     Remove the relocation and symbol table information from an object file to
            save space.

O TIME      Run a command and report timing information on it.

O PROF      Construct a profile of time spent per routine from statistics gathered by
            time-sampling the execution of a program.
            O Subroutine call frequency and average times for C programs.

O MAKE      Controls creation of large programs. Uses a control file specifying source file
            dependencies to make new version; uses time last changed to deduce minimum
            amount of work necessary.
            O Knows about CC, YACC, LEX, etc.
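
The way MAKE is described as using "time last changed" to find the minimum amount of work can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not MAKE's actual algorithm or file format: the function stale_targets, the rules mapping (target to list of dependencies), and the mtimes mapping (file name to time last changed, with missing files absent) are all hypothetical.

```python
def stale_targets(rules, mtimes, goal):
    """Return the targets that must be remade to bring `goal` up to date,
    listed so that each dependency comes before the targets that need it."""
    order = []

    def visit(name):
        deps = rules.get(name)
        if deps is None:                  # a plain source file: nothing to make
            return False
        remade = [visit(dep) for dep in deps]          # check every dependency first
        newest_dep = max((mtimes.get(dep, 0) for dep in deps), default=0)
        stale = (any(remade)                           # a dependency will change
                 or name not in mtimes                 # the target does not exist yet
                 or mtimes[name] < newest_dep)         # the target is older than an input
        if stale and name not in order:
            order.append(name)
        return stale

    visit(goal)
    return order
```

In practice the times could be gathered with os.path.getmtime; passing them in as a mapping keeps the sketch free of file system access.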


1.12. UNIX/32V Programmer’s Manual 


O Manual    Machine-readable version of the UNIX/32V Programmer’s Manual.
            O System overview.
            O All commands.
            O All system calls.
            O All subroutines in C and assembler libraries.
            O All devices and other special files.
            O Formats of file system and kinds of files known to system software.
            O Boot and maintenance procedures.

O MAN       Print specified manual section on your terminal.


1.13. Computer-Aided Instruction 


O LEARN     A program for interpreting CAI scripts, plus scripts for learning about
            UNIX/32V by using it.
            O Scripts for basic files and commands, editor, advanced files and commands,
              EQN, MS macros, C programming language.


2. Languages 


2.1. The C Language 


O CC        Compile and/or link edit programs in the C language. The UNIX/32V operating
            system, most of the subsystems and C itself are written in C. For a full
            description of C, read The C Programming Language, Brian W. Kernighan
            and Dennis M. Ritchie, Prentice-Hall, 1978.
            O General purpose language designed for structured programming.
            O Data types include character, integer, float, double, pointers to all types,
              functions returning above types, arrays of all types, structures and unions of
              all types.
            O Operations intended to give machine-independent control of full machine
              facility, including to-memory operations and pointer arithmetic.
            O Macro preprocessor for parameterized code and inclusion of standard files.
            O All procedures recursive, with parameters by value.
            O Machine-independent pointer manipulation.
            O Object code uses full addressing capability of the VAX-11.
            O Runtime library gives access to all system facilities.
            O Definable data types.
            O Block structure.

O LINT      Verifier for C programs. Reports questionable or nonportable usage such as:
                Mismatched data declarations and procedure interfaces.
                Nonportable type conversions.
                Unused variables, unreachable code, no-effect operations.
                Mistyped pointers.
                Obsolete syntax.
            O Full cross-module checking of separately compiled programs.

O CB        A beautifier for C programs. Does proper indentation and placement of
            braces.


2.2. Fortran


O F77       A full compiler for ANSI Standard Fortran 77.
            O Compatible with C and supporting tools at object level.
            O Optional source compatibility with Fortran 66.
            O Free format source.
            O Optional subscript-range checking, detection of uninitialized variables.
            O All widths of arithmetic: 2- and 4-byte integer; 4- and 8-byte real; 8- and
              16-byte complex.

O RATFOR    Ratfor adds rational control structure a la C to Fortran.
            O Compound statements.
            O If-else, do, for, while, repeat-until, break, next statements.
            O Symbolic constants.
            O File insertion.
            O Free format source.
            O Translation of relationals like >, >=.
            O Produces genuine Fortran to carry away.
            O May be used with F77.

O STRUCT    Converts ordinary ugly Fortran into structured Fortran (i.e., Ratfor), using
            statement grouping, if-else, while, for, repeat-until.


2.3. Other Algorithmic Languages 


O DC        Interactive programmable desk calculator. Has named storage locations as
            well as conventional stack for holding integers or programs.
            O Unlimited precision decimal arithmetic.
            O Appropriate treatment of decimal fractions.
            O Arbitrary input and output radices, in particular binary, octal, decimal and
              hexadecimal.
            O Reverse Polish operators:
                + - * /
                remainder, power, square root,
                load, store, duplicate, clear,
                print, enter program text, execute.

O BC        A C-like interactive interface to the desk calculator DC.
            O All the capabilities of DC with a high-level syntax.
            O Arrays and recursive functions.
            O Immediate evaluation of expressions and evaluation of functions upon call.
            O Arbitrary precision elementary functions: exp, sin, cos, atan.
            O Go-to-less programming.
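
The Reverse Polish style listed for DC in this section can be illustrated with a very small stack evaluator. This is a sketch in Python, not DC itself: it handles only non-negative integer literals, and the one-letter commands d (duplicate), c (clear) and p (print), as well as the function name rpn, are assumptions made for the example. Python integers are of unlimited size, which loosely mirrors the unlimited precision noted above.

```python
import operator

BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,   # integer quotient
    "%": operator.mod,        # remainder
    "^": operator.pow,        # power
}

def rpn(program: str):
    """Evaluate space-separated Reverse Polish tokens; return the list of printed values."""
    stack, printed = [], []
    for token in program.split():
        if token.isdigit():
            stack.append(int(token))
        elif token in BINARY:
            right = stack.pop()           # operands come off in reverse order
            left = stack.pop()
            stack.append(BINARY[token](left, right))
        elif token == "d":                # duplicate the top of the stack
            stack.append(stack[-1])
        elif token == "c":                # clear the stack
            stack.clear()
        elif token == "p":                # print the top of the stack without removing it
            printed.append(stack[-1])
        else:
            raise ValueError(f"unknown token: {token}")
    return printed
```

Operands are pushed first and the operator follows, so no parentheses or precedence rules are needed.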
2.4. Macroprocessing


O M4        A general purpose macroprocessor.
            O Stream-oriented, recognizes macros anywhere in text.
            O Syntax fits with functional syntax of most higher-level languages.
            O Can evaluate integer arithmetic expressions.


2.5. Compiler-compilers 


O YACC      An LR(1)-based compiler writing system. During execution of resulting
            parsers, arbitrary C functions may be called to do code generation or semantic
            actions.
            O BNF syntax specifications.
            O Precedence relations.
            O Accepts formally ambiguous grammars with non-BNF resolution rules.

O LEX       Generator of lexical analyzers. Arbitrary C functions may be called upon
            isolation of each lexical token.
            O Full regular expression, plus left and right context dependence.
            O Resulting lexical analyzers interface cleanly with YACC parsers.


3. Text Processing


3.1. Document Preparation


O ED        Interactive context editor. Random access to all lines of a file.
            O Find lines by number or pattern. Patterns may include: specified characters,
              don’t care characters, choices among characters, repetitions of these
              constructs, beginning of line, end of line.
            O Add, delete, change, copy, move or join lines.
            O Permute or split contents of a line.
            O Replace one or all instances of a pattern within a line.
            O Combine or split files.
            O Escape to Shell (command language) during editing.
            O Do any of above operations on every pattern-selected line in a given range.
            O Optional encryption for extra security.

O PTX       Make a permuted (key word in context) index.

O SPELL     Look for spelling errors by comparing each word in a document against a word
            list.
            O 25,000-word list includes proper names.
            O Handles common prefixes and suffixes.
            O Collects words to help tailor local spelling lists.

O LOOK      Search for words in dictionary that begin with specified prefix.

O CRYPT     Encrypt and decrypt files for security.


3.2. Document Formatting 


O TROFF
O NROFF     Advanced typesetting. TROFF drives a Graphic Systems phototypesetter;
            NROFF drives ASCII terminals of all types. This summary was typeset using
            TROFF. TROFF and NROFF are capable of elaborate feats of formatting,
            when appropriately programmed. TROFF and NROFF accept the same input
            language.
            O Completely definable page format keyed to dynamically planted “interrupts”
              at specified lines.
            O Maintains several separately definable typesetting environments (e.g., one
              for body text, one for footnotes, and one for unusually elaborate headings).
            O Arbitrary number of output pools can be combined at will.
            O Macros with substitutable arguments, and macros invocable in mid-line.
            O Computation and printing of numerical quantities.
            O Conditional execution of macros.
            O Tabular layout facility.
            O Positions expressible in inches, centimeters, ems, points, machine units or
              arithmetic combinations thereof.
            O Access to character-width computation for unusually difficult layout
              problems.
            O Overstrikes, built-up brackets, horizontal and vertical line drawing.
            O Dynamic relative or absolute positioning and size selection, globally or at
              the character level.
            O Can exploit the characteristics of the terminal being used, for approximating
              special characters, reverse motions, proportional spacing, etc.


The Graphic Systems typesetter has a vocabulary of several 102-character fonts (4 simultane- 
ously) in 15 sizes. TROFF provides terminal output for rough sampling of the product. 


NROFF will produce multicolumn output on terminals capable of reverse line feed, or through 
the postprocessor COL. 


High programming skill is required to exploit the formatting capabilities of TROFF and 
NROFF, although unskilled personnel can easily be trained to enter documents according to 
canned formats such as those provided by MS, below. TROFF and EQN are essentially ident- 
ical to NROFF and NEQN so it is usually possible to define interchangeable formats to pro- 
duce approximate proof copy on terminals before actual typesetting. The preprocessors MS, 
TBL, and REFER are fully compatible with TROFF and NROFF. 


O MS A standardized manuscript layout package for use with NROFF/TROFF. 
This document was formatted with MS. 

O Page numbers and draft dates. 

O Automatically numbered subheads. 

O Footnotes. 

O Single or double column. 

O Paragraphing, display and indentation. 

O Numbered equations. 


O EQN A mathematical typesetting preprocessor for TROFF. Translates easily 
readable formulas, either in-line or displayed, into detailed typesetting 
instructions. Formulas are written in a style like this: 


sigma sup 2 ~=~ 1 over N sum from i=1 to N ( x sub i - x bar ) sup 2 


which produces: 


$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2$$ 


O Automatic calculation of size changes for subscripts, sub-subscripts, etc. 

O Full vocabulary of Greek letters and special symbols, such as ‘gamma’, 
‘GAMMA’, ‘integral’. 

O Automatic calculation of large bracket sizes. 

O Vertical “piling” of formulae for matrices, conditional alternatives, etc. 

O Integrals, sums, etc., with arbitrarily complex limits. 


O Diacriticals: dots, double dots, hats, bars, etc. 

O Easily learned by nonprogrammers and mathematical typists. 


O NEQN A version of EQN for NROFF; accepts the same input language. Prepares 
formulas for display on any terminal that NROFF knows about, for example, 
those based on Diablo printing mechanism. 

O Same facilities as EQN within graphical capability of terminal. 


O TBL A preprocessor for NROFF/TROFF that translates simple descriptions of 
table layouts and contents into detailed typesetting instructions. 

O Computes column widths. 

O Handles left- and right-justified columns, centered columns and decimal-point 
alignment. 

O Places column titles. 

O Table entries can be text, which is adjusted to fit. 

O Can box all or parts of table. 


O REFER Fills in bibliographic citations in a document from a data base (not supplied). 

O References may be printed in any style, as they occur or collected at the 
end. 

O May be numbered sequentially, by name of author, etc. 


O TC Simulate Graphic Systems typesetter on Tektronix 4014 scope. Useful for 
checking TROFF page layout before typesetting. 


O COL Canonicalize files with reverse line feeds for one-pass printing. 


O DEROFF Remove all TROFF commands from input. 

O CHECKEQ Check document for possible errors in EQN usage. 


4. Information Handling 


O SORT Sort or merge ASCII files line-by-line. No limit on input size. 

O Sort up or down. 

O Sort lexicographically or on numeric key. 

O Multiple keys located by delimiters or by character position. 

O May sort upper case together with lower into dictionary order. 

O Optionally suppress duplicate data. 


O TSORT Topological sort: converts a partial order into a total order. 


O UNIQ Collapse successive duplicate lines in a file into one line. 

O Publish lines that were originally unique, duplicated, or both. 

O May give redundancy count for each line. 


O TR Do one-to-one character translation according to an arbitrary code. 

O May coalesce selected repeated characters. 

O May delete selected characters. 


O DIFF Report line changes, additions and deletions necessary to bring two files into 
agreement. 

O May produce an editor script to convert one file into another. 

O A variant compares two new versions against one old one. 


O COMM Identify common lines in two sorted files. Output in up to 3 columns shows 
lines present in first file only, present in both, and/or present in second only. 


O JOIN Combine two files by joining records that have identical keys. 


O GREP Print all lines in a file that satisfy a pattern as used in the editor ED. 

O May print all lines that fail to match. 

O May print count of hits. 

O May print first hit in each file. 


O LOOK Binary search in sorted file for lines with specified prefix. 


O WC Count the lines, “words” (blank-separated strings) and characters in a file. 


O SED Stream-oriented version of ED. Can perform a sequence of editing operations 
on each line of an input stream of unbounded length. 

O Lines may be selected by address or range of addresses. 

O Control flow and conditional testing. 

O Multiple output streams. 

O Multi-line capability. 


O AWK Pattern scanning and processing language. Searches input for patterns, and 
performs actions on each line of input that satisfies the pattern. 

O Patterns include regular expressions, arithmetic and lexicographic conditions, 
boolean combinations and ranges of these. 

O Data treated as string or numeric as appropriate. 

O Can break input into fields; fields are variables. 

O Variables and arrays (with non-numeric subscripts). 

O Full set of arithmetic operators and control flow. 

O Multiple output streams to files and pipes. 

O Output can be formatted as desired. 

O Multi-line capabilities. 


5. Graphics 


The programs in this section are predominantly intended for use with Tektronix 4014 storage 
scopes. 


O GRAPH Prepares a graph of a set of input numbers. 

O Input scaled to fit standard plotting area. 

O Abscissae may be supplied automatically. 

O Graph may be labeled. 

O Control over grid style, line style, graph orientation, etc. 


O SPLINE Provides a smooth curve through a set of points intended for GRAPH. 


O PLOT A set of filters for printing graphs produced by GRAPH and other programs 
on various terminals. Filters provided for 4014, DASI terminals, Versatec 
printer/plotter. 


6. Novelties, Games, and Things That Didn’t Fit Anywhere Else 


O BACKGAMMON A player of modest accomplishment. 

O BCD Converts ascii to card-image form. 

O CAL Print a calendar of specified month and year. 

O CHING The I Ching. Place your own interpretation on the output. 

O FORTUNE Presents a random fortune cookie on each invocation. Limited jar of cookies 
included. 

O UNITS Convert amounts between different scales of measurement. Knows hundreds 
of units. For example, how many km/sec is a parsec/megayear? 




O ARITHMETIC Speed and accuracy test for number facts. 
O QUIZ Test your knowledge of Shakespeare, Presidents, capitals, etc. 
O WUMP Hunt the wumpus, thrilling search in a dangerous cave. 
O HANGMAN Word-guessing game. Uses a dictionary supplied with SPELL. 
O FISH Children’s card-guessing game. 
The UNIX Time-Sharing System* 


D. M. Ritchie and K. Thompson 


ABSTRACT 


UNIX† is a general-purpose, multi-user, interactive operating system for 
the larger Digital Equipment Corporation PDP-11 and the Interdata 8/32 com- 
puters. It offers a number of features seldom found even in larger operating 
systems, including 


i A hierarchical file system incorporating demountable volumes, 
ii Compatible file, device, and inter-process I/O, 

iii The ability to initiate asynchronous processes, 

iv System command language selectable on a per-user basis, 

v Over 100 subsystems including a dozen languages, 

vi High degree of portability. 


This paper discusses the nature and implementation of the file system and of 
the user command interface. 


1. INTRODUCTION 


There have been four versions of the UNIX time-sharing system. The earliest (circa 
1969-70) ran on the Digital Equipment Corporation PDP-7 and -9 computers. The second ver- 
sion ran on the unprotected PDP-11/20 computer. The third incorporated multiprogramming 
and ran on the PDP-11/34, /40, /45, /60, and /70 computers; it is the one described in the pre- 
viously published version of this paper, and is also the most widely used today. This paper 
describes only the fourth, current system that runs on the PDP-11/70 and the Interdata 8/32 
computers. In fact, the differences among the various systems are rather small; most of the 
revisions made to the originally published version of this paper, aside from those concerned 
with style, had to do with details of the implementation of the file system. 


Since PDP-11 UNIX became operational in February, 1971, over 600 installations have 
been put into service. Most of them are engaged in applications such as computer science 
education, the preparation and formatting of documents and other textual material, the collec- 
tion and processing of trouble data from various switching machines within the Bell System, 
and recording and checking telephone service orders. Our own installation is used mainly for 
research in operating systems, languages, computer networks, and other topics in computer 
science, and also for document preparation. 


Perhaps the most important achievement of UNIX is to demonstrate that a powerful 
operating system for interactive use need not be expensive either in equipment or in human 
effort: it can run on hardware costing as little as $40,000, and less than two man-years were 
spent on the main system software. We hope, however, that users find that the most 
important characteristics of the system are its simplicity, elegance, and ease of use. 


* Copyright 1974, Association for Computing Machinery, Inc., reprinted by permission. This is a revised 
version of an article that appeared in Communications of the ACM, 17, No. 7 (July 1974), pp. 365-375. That 
article was a revised version of a paper presented at the Fourth ACM Symposium on Operating Systems 
Principles, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, October 15-17, 1973. 

† UNIX is a trademark of Bell Laboratories. 


Besides the operating system proper, some major programs available under UNIX are 


C compiler 

Text editor based on QED¹ 

Assembler, linking loader, symbolic debugger 

Phototypesetting and equation setting programs”? 

Dozens of languages including Fortran 77, Basic, Snobol, APL, Algol 68, M6, TMG, 
Pascal 


There is a host of maintenance, utility, recreation and novelty programs, all written locally. 
The UNIX user community, which numbers in the thousands, has contributed many more pro- 
grams and languages. It is worth noting that the system is totally self-supporting. All UNIX 
software is maintained on the system; likewise, this paper and all other documents in this 
issue were generated and formatted by the UNIX editor and text formatting programs. 


II. HARDWARE AND SOFTWARE ENVIRONMENT 


The PDP-11/70 on which the Research UNIX system is installed is a 16-bit word (8-bit 
byte) computer with 768K bytes of core memory; the system kernel occupies 90K bytes about 
equally divided between code and data tables. This system, however, includes a very large 
number of device drivers and enjoys a generous allotment of space for I/O buffers and system 
tables; a minimal system capable of running the software mentioned above can require as little 
as 96K bytes of core altogether. There are even larger installations; see the description of the 
PWB/UNIX systems,*? for example. There are also much smaller, though somewhat restricted, 
versions of the system.? 


Our own PDP-11 has two 200-Mb moving-head disks for file system storage and swap- 
ping. There are 20 variable-speed communications interfaces attached to 300- and 1200-baud 
data sets, and an additional 12 communication lines hard-wired to 9600-baud terminals and 
satellite computers. There are also several 2400- and 4800-baud synchronous communication 
interfaces used for machine-to-machine file transfer. Finally, there is a variety of miscellane- 
ous devices including nine-track magnetic tape, a line printer, a voice synthesizer, a photo- 
typesetter, a digital switching network, and a chess machine. 


The preponderance of UNIX software is written in the abovementioned C language.® 
Early versions of the operating system were written in assembly language, but during the sum- 
mer of 1973, it was rewritten in C. The size of the new system was about one-third greater 
than that of the old. Since the new system not only became much easier to understand and to 
modify but also included many functional improvements, including multiprogramming and the 
ability to share reentrant code among several user programs, we consider this increase in size 
quite acceptable. 


III. THE FILE SYSTEM 


The most important role of the system is to provide a file system. From the point of 
view of the user, there are three kinds of files: ordinary disk files, directories, and special files. 


3.1 Ordinary files 


A file contains whatever information the user places on it, for example, symbolic or 
binary (object) programs. No particular structuring is expected by the system. A file of text 
consists simply of a string of characters, with lines demarcated by the newline character. 
Binary programs are sequences of words as they will appear in core memory when the pro- 
gram starts executing. A few user programs manipulate files with more structure; for example, 
the assembler generates, and the loader expects, an object file in a particular format. How- 
ever, the structure of files is controlled by the programs that use them, not by the system. 
3.2 Directories 


Directories provide the mapping between the names of files and the files themselves, and 
thus induce a structure on the file system as a whole. Each user has a directory of his own 
files; he may also create subdirectories to contain groups of files conveniently treated together. 
A directory behaves exactly like an ordinary file except that it cannot be written on by 
unprivileged programs, so that the system controls the contents of directories. However, any- 
one with appropriate permission may read a directory just like any other file. 


The system maintains several directories for its own use. One of these is the root direc- 
tory. All files in the system can be found by tracing a path through a chain of directories 
until the desired file is reached. The starting point for such searches is often the root. Other 
system directories contain all the programs provided for general use; that is, all the 
commands. As will be seen, however, it is by no means necessary that a program reside in one 
of these directories for it to be executed. 


Files are named by sequences of 14 or fewer characters. When the name of a file is 
specified to the system, it may be in the form of a path name, which is a sequence of directory 
names separated by slashes, “/”, and ending in a file name. If the sequence begins with a 
slash, the search begins in the root directory. The name /alpha/beta/gamma causes the 
system to search the root for directory alpha, then to search alpha for beta, finally to find 
gamma in beta. gamma may be an ordinary file, a directory, or a special file. As a limit- 
ing case, the name “/” refers to the root itself. 


A path name not starting with “/” causes the system to begin the search in the user’s 
current directory. Thus, the name alpha/beta specifies the file named beta in subdirectory 
alpha of the current directory. The simplest kind of name, for example, alpha, refers to a 
file that itself is found in the current directory. As another limiting case, the null file name 
refers to the current directory. 
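
The search rule described above can be sketched in a few lines of Python over a toy in-memory tree. This is only an illustration under stated assumptions: nested dictionaries stand in for directories and strings for ordinary files, the name `resolve` and the sample tree are hypothetical, and the 14-character limit on names is not modeled.

```python
# Toy model (an assumption for illustration): a dict is a directory,
# a str is the contents of an ordinary file.
ROOT = {"alpha": {"beta": {"gamma": "contents of gamma"}}}

def resolve(path, root, current):
    """Follow a path name one component at a time."""
    # A leading slash starts the search at the root; otherwise the
    # search starts in the current directory.
    node = root if path.startswith("/") else current
    # Empty components are dropped, so "/" yields the root itself and
    # the null name yields the current directory (the limiting cases).
    for name in [c for c in path.split("/") if c]:
        if not isinstance(node, dict):
            raise NotADirectoryError(name)
        if name not in node:
            raise FileNotFoundError(name)
        node = node[name]
    return node
```

For example, `resolve("/alpha/beta/gamma", ROOT, ROOT)` walks from the root through alpha and beta, while `resolve("beta/gamma", ROOT, ROOT["alpha"])` reaches the same file starting from the current directory alpha.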


The same non-directory file may appear in several directories under possibly different 
names. This feature is called linking; a directory entry for a file is sometimes called a link. 
The UNIX system differs from other systems in which linking is permitted in that all links to 
a file have equal status. That is, a file does not exist within a particular directory; the direc- 
tory entry for a file consists merely of its name and a pointer to the information actually 
describing the file. Thus a file exists independently of any directory entry, although in prac- 
tice a file is made to disappear along with the last link to it. 
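
The equal status of links can be observed on a present-day UNIX-like system. The sketch below uses Python's `os` module as a thin wrapper over the link and status calls; the file names are arbitrary, and the behavior shown assumes a file system that supports hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "first")
    second = os.path.join(d, "second")
    with open(first, "w") as f:
        f.write("shared contents\n")

    os.link(first, second)              # a second directory entry for the same file
    a, b = os.stat(first), os.stat(second)
    same_file = (a.st_ino == b.st_ino)  # both names point at one file description
    links_before = a.st_nlink           # the link count now covers both names

    os.remove(first)                    # removing one link leaves the other usable
    with open(second) as f:
        survivor = f.read()
    links_after = os.stat(second).st_nlink
```

Neither name is the "original": the file survives the removal of the name under which it was created, and disappears only when the last link goes.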




Each directory always has at least two entries. The name “.” in each directory refers to 
the directory itself. Thus a program may read the current directory under the name “.” 
without knowing its complete path name. The name “..” by convention refers to the parent 
of the directory in which it appears, that is, to the directory in which it was created. 


The directory structure is constrained to have the form of a rooted tree. Except for the 
special entries “.” and “..”, each directory must appear as an entry in exactly one other 
directory, which is its parent. The reason for this is to simplify the writing of programs that 
visit subtrees of the directory structure, and more important, to avoid the separation of por- 
tions of the hierarchy. If arbitrary links to directories were permitted, it would be quite 
difficult to detect when the last connection from the root to a directory was severed. 


3.3 Special files 


Special files constitute the most unusual feature of the UNIX file system. Each sup- 
ported I/O device is associated with at least one such file. Special files are read and written 
just like ordinary disk files, but requests to read or write result in activation of the associated 
device. An entry for each special file resides in directory /dev, although a link may be made 
to one of these files just as it may to an ordinary file. Thus, for example, to write on a mag- 
netic tape one may write on the file /dev/mt. Special files exist for each communication line, 
each disk, each tape drive, and for physical main memory. Of course, the active disks and the 
memory special file are protected from indiscriminate access. 
There is a threefold advantage in treating I/O devices this way: file and device I/O are as 
similar as possible; file and device names have the same syntax and meaning, so that a pro- 
gram expecting a file name as a parameter can be passed a device name; finally, special files 
are subject to the same protection mechanism as regular files. 
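
The second advantage, that a program expecting a file name can be handed a device name, can be illustrated with a small sketch. It assumes a modern UNIX-like system where the null device exists (Python exposes its name as `os.devnull`); the helper `write_bytes` is hypothetical.

```python
import os
import tempfile

def write_bytes(name, data):
    """Open `name` for writing, write `data`, and return the byte count."""
    fd = os.open(name, os.O_WRONLY)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)

# An ordinary disk file ...
fd, path = tempfile.mkstemp()
os.close(fd)
to_file = write_bytes(path, b"hello\n")
os.remove(path)

# ... and a special file, passed to the very same routine.
to_device = write_bytes(os.devnull, b"hello\n")
```

The routine contains no device-specific code; the name alone decides whether the bytes go to a disk file or activate a device.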


3.4 Removable file systems 


Although the root of the file system is always stored on the same device, it is not neces- 
sary that the entire file system hierarchy reside on this device. There is a mount system 
request with two arguments: the name of an existing ordinary file, and the name of a special 
file whose associated storage volume (e.g., a disk pack) should have the structure of an 
independent file system containing its own directory hierarchy. The effect of mount is to 
cause references to the heretofore ordinary file to refer instead to the root directory of the file 
system on the removable volume. In effect, mount replaces a leaf of the hierarchy tree (the 
ordinary file) by a whole new subtree (the hierarchy stored on the removable volume). After 
the mount, there is virtually no distinction between files on the removable volume and those 
in the permanent file system. In our installation, for example, the root directory resides on a 
small partition of one of our disk drives, while the other drive, which contains the user’s files, 
is mounted by the system initialization sequence. A mountable file system is generated by 
writing on its corresponding special file. A utility program is available to create an empty file 
system, or one may simply copy an existing file system. 


There is only one exception to the rule of identical treatment of files on different dev- 
ices: no link may exist between one file system hierarchy and another. This restriction is 
enforced so as to avoid the elaborate bookkeeping that would otherwise be required to assure 
removal of the links whenever the removable volume is dismounted. 


3.5 Protection 


Although the access control scheme is quite simple, it has some unusual features. Each 
user of the system is assigned a unique user identification number. When a file is created, it 
is marked with the user ID of its owner. Also given for new files is a set of ten protection bits. 
Nine of these specify independently read, write, and execute permission for the owner of the 
file, for other members of his group, and for all remaining users. 
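
As an illustration of the nine independent permission bits, the following sketch decodes a mode value into the familiar three groups of read, write, and execute flags. It uses the constants of Python's standard `stat` module; the function name `describe` is an assumption made for this example.

```python
import stat

def describe(mode):
    """Render the nine permission bits as rwx triples: owner, group, others."""
    bits = [
        (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),
        (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),
        (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),
    ]
    # Each bit is tested on its own; no bit implies any other.
    return "".join(ch if mode & bit else "-" for bit, ch in bits)
```

A mode of octal 754, for instance, grants everything to the owner, read and execute to the group, and read only to all remaining users.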


If the tenth bit is on, the system will temporarily change the user identification 
(hereafter, user ID) of the current user to that of the creator of the file whenever the file is 
executed as a program. This change in user ID is effective only during the execution of the 
program that calls for it. The set-user-ID feature provides for privileged programs that may 
use files inaccessible to other users. For example, a program may keep an accounting file that 
should neither be read nor changed except by the program itself. If the set-user-ID bit is on 
for the program, it may access the file although this access might be forbidden to other pro- 
grams invoked by the given program’s user. Since the actual user ID of the invoker of any pro- 
gram is always available, set-user-ID programs may take any measures desired to satisfy them- 
selves as to their invoker’s credentials. This mechanism is used to allow users to execute the 
carefully written commands that call privileged system entries. For example, there is a system 
entry invokable only by the “super-user” (below) that creates an empty directory. As indi- 
cated above, directories are expected to have entries for “.” and “..”. The command which 
creates a directory is owned by the super-user and has the set-user-ID bit set. After it checks 
its invoker’s authorization to create the specified directory, it creates it and makes the entries 
for “.” and “..”. 


Because anyone may set the set-user-ID bit on one of his own files, this mechanism is 
generally available without administrative intervention. For example, this protection scheme 
easily solves the MOO accounting problem posed by “Aleph-null.”⁶ 


The system recognizes one particular user ID (that of the “super-user”) as exempt from 
the usual constraints on file access; thus (for example), programs may be written to dump and 
reload the file system without unwanted interference from the protection system. 
3.6 I/O calls 


The system calls to do I/O are designed to eliminate the differences between the various 
devices and styles of access. There is no distinction between “random” and “sequential” I/O, 
nor is any logical record size imposed by the system. The size of an ordinary file is deter- 
mined by the number of bytes written on it; no predetermination of the size of a file is neces- 
sary or possible. 


To illustrate the essentials of I/O, some of the basic calls are summarized below in an 
anonymous language that will indicate the required parameters without getting into the 
underlying complexities. Each call to the system may potentially result in an error return, 
which for simplicity is not represented in the calling sequence. 


To read or write a file assumed to exist already, it must be opened by the following call: 
filep = open (name, flag ) 


where name indicates the name of the file. An arbitrary path name may be given. The flag 
argument indicates whether the file is to be read, written, or “updated,” that is, read and writ- 
ten simultaneously. 


The returned value filep is called a file descriptor. It is a small integer used to identify 
the file in subsequent calls to read, write, or otherwise manipulate the file. 


To create a new file or completely rewrite an old one, there is a create system call that 
creates the given file if it does not exist, or truncates it to zero length if it does exist; create 
also opens the new file for writing and, like open, returns a file descriptor. 


The file system maintains no locks visible to the user, nor is there any restriction on the 
number of users who may have a file open for reading or writing. Although it is possible for 
the contents of a file to become scrambled when two users write on it simultaneously, in prac- 
tice difficulties do not arise. We take the view that locks are neither necessary nor sufficient, 
in our environment, to prevent interference between users of the same file. They are unneces- 
sary because we are not faced with large, single-file data bases maintained by independent 
processes. They are insufficient because locks in the ordinary sense, whereby one user is 
prevented from writing on a file that another user is reading, cannot prevent confusion when, 
for example, both users are editing a file with an editor that makes a copy of the file being 
edited. 


There are, however, sufficient internal interlocks to maintain the logical consistency of 
the file system when two users engage simultaneously in activities such as writing on the same 
file, creating files in the same directory, or deleting each other’s open files. 


Except as indicated below, reading and writing are sequential. This means that if a par- 
ticular byte in the file was the last byte written (or read), the next I/O call implicitly refers to 
the immediately following byte. For each open file there is a pointer, maintained inside the 
system, that indicates the next byte to be read or written. If n bytes are read or written, the 
pointer advances by n bytes. 


Once a file is open, the following calls may be used: 


n = read (filep, buffer, count) 
n = write (filep, buffer, count) 


Up to count bytes are transmitted between the file specified by filep and the byte array 
specified by buffer. The returned value n is the number of bytes actually transmitted. In 
the write case, n is the same as count except under exceptional conditions, such as I/O 
errors or end of physical medium on special files; in a read, however, n may without error be 
less than count. If the read pointer is so near the end of the file that reading count charac- 
ters would cause reading beyond the end, only sufficient bytes are transmitted to reach the 
end of the file; also, typewriter-like terminals never return more than one line of input. When 
a read call returns with n equal to zero, the end of the file has been reached. For disk files 
this occurs when the read pointer becomes equal to the current size of the file. It is possible 
to generate an end-of-file from a terminal by use of an escape sequence that depends on the 
device used. 
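
The return-value conventions of read and write can be tried out with Python's `os` module, whose `os.open`, `os.read`, and `os.write` are thin wrappers over the corresponding calls on a modern system. This is a sketch rather than the paper's own notation: the flag names are Python's, and the truncating open stands in for the create call.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Rewrite the file from scratch, in the spirit of the create call.
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
n_written = os.write(fd, b"0123456789")   # returns the number of bytes written
os.close(fd)

fd = os.open(path, os.O_RDONLY)
first = os.read(fd, 4)      # a full count of 4 bytes
rest = os.read(fd, 100)     # fewer than requested: only 6 bytes remain
at_end = os.read(fd, 100)   # zero bytes signals end of file
os.close(fd)
os.remove(path)
```

Note that the short second read is not an error, and that the pointer advanced implicitly between the calls without any explicit positioning.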


Bytes written affect only those parts of a file implied by the position of the write pointer 
and the count; no other part of the file is changed. If the last byte lies beyond the end of the 
file, the file is made to grow as needed. 


To do random (direct-access) I/O it is only necessary to move the read or write pointer 
to the appropriate location in the file. 


location = lseek (filep, offset, base ) 


The pointer associated with filep is moved to a position offset bytes from the beginning of 
the file, from the current position of the pointer, or from the end of the file, depending on 
base. offset may be negative. For some devices (e.g., paper tape and terminals) seek calls 
are ignored. The actual offset from the beginning of the file to which the pointer was moved 
is returned in location. 
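
A brief sketch of pointer movement follows, again through Python's `os` wrappers. The constants `os.SEEK_SET`, `os.SEEK_CUR`, and `os.SEEK_END` are Python's names for the three choices of base; they are not taken from the paper.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()             # opened for both reading and writing
os.write(fd, b"abcdefghij")

start = os.lseek(fd, 2, os.SEEK_SET)      # 2 bytes from the beginning
three = os.read(fd, 3)                    # reads "cde"; pointer is now at 5
back = os.lseek(fd, -4, os.SEEK_CUR)      # a negative offset from the current position
size = os.lseek(fd, 0, os.SEEK_END)       # offset 0 from the end gives the file size

os.lseek(fd, 5, os.SEEK_END)              # position the pointer past the end ...
os.write(fd, b"Z")                        # ... and the file grows as needed
new_size = os.lseek(fd, 0, os.SEEK_END)

os.close(fd)
os.remove(path)
```

Random access thus needs no separate family of calls: the same read and write are used, preceded by a move of the pointer.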


There are several additional system entries having to do with I/O and with the file sys- 
tem that will not be discussed. For example: close a file, get the status of a file, change the 
protection mode or the owner of a file, create a directory, make a link to an existing file, delete 
a file. 


IV. IMPLEMENTATION OF THE FILE SYSTEM 


As mentioned in Section 3.2 above, a directory entry contains only a name for the associ- 
ated file and a pointer to the file itself. This pointer is an integer called the i-number (for 
index number) of the file. When the file is accessed, its i-number is used as an index into a 
system table (the i-list) stored in a known part of the device on which the directory resides. 
The entry found thereby (the file’s i-node) contains the description of the file: 


i the user and group-ID of its owner 

ii its protection bits 

iii the physical disk or tape addresses for the file contents 

iv its size 

v time of creation, last use, and last modification 

vi the number of links to the file, that is, the number of times it appears in a directory 

vii a code indicating whether the file is a directory, an ordinary file, or a special file. 


The purpose of an open or create system call is to turn the path name given by the user 
into an i-number by searching the explicitly or implicitly named directories. Once a file is 
open, its device, i-number, and read/write pointer are stored in a system table indexed by the 
file descriptor returned by the open or create. Thus, during a subsequent call to read or 
write the file, the descriptor may be easily related to the information necessary to access the 
file. 


When a new file is created, an i-node is allocated for it and a directory entry is made 
that contains the name of the file and the i-node number. Making a link to an existing file 
involves creating a directory entry with the new name, copying the i-number from the original 
file entry, and incrementing the link-count field of the i-node. Removing (deleting) a file is 
done by decrementing the link-count of the i-node specified by its directory entry and erasing 
the directory entry. If the link-count drops to 0, any disk blocks in the file are freed and the 
i-node is de-allocated. 
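
The bookkeeping for creating, linking, and removing can be sketched with a toy model. Everything here is an assumption made for illustration: the class `ToyFileSystem`, the single flat directory, and the dictionary standing in for the i-list are not the system's actual data structures.

```python
class ToyFileSystem:
    """Toy model: the i-list is a dict, and one flat directory maps names to i-numbers."""

    def __init__(self):
        self.ilist = {}        # i-number -> {"links": count, "blocks": [...]}
        self.directory = {}    # name -> i-number
        self.next_inumber = 1

    def create(self, name):
        inumber = self.next_inumber          # allocate an i-node ...
        self.next_inumber += 1
        self.ilist[inumber] = {"links": 1, "blocks": []}
        self.directory[name] = inumber       # ... and enter name and i-number
        return inumber

    def link(self, old, new):
        inumber = self.directory[old]        # copy the i-number from the old entry
        self.directory[new] = inumber
        self.ilist[inumber]["links"] += 1    # and increment the link count

    def remove(self, name):
        inumber = self.directory.pop(name)   # erase the directory entry
        inode = self.ilist[inumber]
        inode["links"] -= 1
        if inode["links"] == 0:              # last link gone: free the i-node
            del self.ilist[inumber]
```

In this model a file created as "a" and linked as "b" survives the removal of "a"; only removing "b" as well de-allocates its i-node.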


The space on all disks that contain a file system is divided into a number of 512-byte 
blocks logically addressed from 0 up to a limit that depends on the device. There is space in 
the i-node of each file for 13 device addresses. For nonspecial files, the first 10 device 
addresses point at the first 10 blocks of the file. If the file is larger than 10 blocks, the 
eleventh device address points to an indirect block containing up to 128 addresses of additional 
blocks in the file. Still larger files use the twelfth device address of the i-node to point to a 
double-indirect block naming 128 indirect blocks, each pointing to 128 blocks of the file. If required, 
the thirteenth device address is a triple-indirect block. Thus files may conceptually grow to 
$(10+128+128^2+128^3) \cdot 512$ bytes. Once opened, bytes numbered below 5120 can be read with 
a single disk access; bytes in the range 5120 to 70,656 require two accesses; bytes in the range 
70,656 to 8,459,264 require three accesses; bytes from there to the largest file (1,082,201,088) 
require four accesses. In practice, a device cache mechanism (see below) proves effective in 
eliminating most of the indirect fetches. 
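
The figures quoted above follow directly from the block size and the number of addresses per block. The short calculation below reproduces them; the constant and function names are chosen for this example only, and the device cache is ignored.

```python
BLOCK = 512          # bytes per block
DIRECT = 10          # direct addresses in the i-node
PER_INDIRECT = 128   # addresses held by one indirect block

# Byte offsets at which one more level of indirection becomes necessary.
LIMITS = [
    DIRECT * BLOCK,
    (DIRECT + PER_INDIRECT) * BLOCK,
    (DIRECT + PER_INDIRECT + PER_INDIRECT**2) * BLOCK,
    (DIRECT + PER_INDIRECT + PER_INDIRECT**2 + PER_INDIRECT**3) * BLOCK,
]

def disk_accesses(offset):
    """Disk accesses needed to read the byte at `offset`, ignoring the cache."""
    for level, limit in enumerate(LIMITS, start=1):
        if offset < limit:
            return level
    raise ValueError("offset beyond the largest possible file")
```

The last entry of `LIMITS` is the largest file size, 1,082,201,088 bytes, and each earlier entry is the boundary between one access count and the next.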


The foregoing discussion applies to ordinary files. When an I/O request is made to a file 
whose i-node indicates that it is special, the last 12 device address words are immaterial, and 
the first specifies an internal device name, which is interpreted as a pair of numbers 
representing, respectively, a device type and subdevice number. The device type indicates 
which system routine will deal with I/O on that device; the subdevice number selects, for 
example, a disk drive attached to a particular controller or one of several similar terminal 
interfaces. 


In this environment, the implementation of the mount system call (Section 3.4) is quite 
straightforward. mount maintains a system table whose argument is the i-number and device 
name of the ordinary file specified during the mount, and whose corresponding value is the 
device name of the indicated special file. This table is searched for each i-number/device pair 
that turns up while a path name is being scanned during an open or create; if a match is 
found, the i-number is replaced by the i-number of the root directory and the device name is 
replaced by the table value. 
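
The table lookup just described can be sketched in a few lines of Python. This is only an illustration: the dictionary, the function names, the representation of a device name as a tuple, and the value chosen for ROOT_INUMBER are all assumptions, since the text does not give them.

```python
# Key: (i-number, device name) of the ordinary file named in the mount.
# Value: device name of the special file that was mounted there.
mount_table = {}

ROOT_INUMBER = 1   # assumed i-number of a file system's root directory


def mount(inumber, device, mounted_device):
    """Record that `mounted_device` is mounted on the file (inumber, device)."""
    mount_table[(inumber, device)] = mounted_device


def cross_mount_point(inumber, device):
    """Apply the replacement rule to one pair met while scanning a path name."""
    mounted = mount_table.get((inumber, device))
    if mounted is None:
        return inumber, device          # not a mount point: pair is unchanged
    return ROOT_INUMBER, mounted        # continue at the root of the mounted device
```

A path-name scan would call cross_mount_point on every i-number/device pair it reaches; only pairs recorded by mount are rewritten.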


To the user, both reading and writing of files appear to be synchronous and unbuffered. 
That is, immediately after return from a read call the data are available; conversely, after a 
write the user’s workspace may be reused. In fact, the system maintains a rather compli- 
cated buffering mechanism that reduces greatly the number of I/O operations required to 
access a file. Suppose a write call is made specifying transmission of a single byte. The sys- 
tem will search its buffers to see whether the affected disk block currently resides in main 
memory; if not, it will be read in from the device. Then the affected byte is replaced in the 
buffer and an entry is made in a list of blocks to be written. The return from the write call 
may then take place, although the actual I/O may not be completed until a later time. Con- 
versely, if a single byte is read, the system determines whether the secondary storage block in 
which the byte is located is already in one of the system’s buffers; if so, the byte can be 
returned immediately. If not, the block is read into a buffer and the byte picked out. 
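
A toy model may make the single-byte example concrete. In the Python sketch below, a dictionary stands in for the device, and the class name, the method names, and the explicit flush call are assumptions made for illustration; the real system decides for itself when delayed writes reach the disk.

```python
BLOCK = 512


class BufferCache:
    def __init__(self, disk):
        self.disk = disk          # dict: block number -> bytes (stand-in device)
        self.buffers = {}         # blocks currently held in main memory
        self.dirty = []           # list of blocks still to be written
        self.device_reads = 0     # how often the device had to be read

    def _load(self, blockno):
        # Read the block from the device only if it is not already buffered.
        if blockno not in self.buffers:
            self.device_reads += 1
            self.buffers[blockno] = bytearray(self.disk.get(blockno, bytes(BLOCK)))
        return self.buffers[blockno]

    def write_byte(self, offset, value):
        blockno = offset // BLOCK
        self._load(blockno)[offset % BLOCK] = value   # replace the byte in the buffer
        if blockno not in self.dirty:
            self.dirty.append(blockno)                # schedule the block for output
        # Returning here corresponds to the early return from write.

    def read_byte(self, offset):
        return self._load(offset // BLOCK)[offset % BLOCK]

    def flush(self):
        # The deferred I/O: copy every scheduled block back to the device.
        for blockno in self.dirty:
            self.disk[blockno] = bytes(self.buffers[blockno])
        self.dirty.clear()
```

After a write_byte, a read_byte of the same offset is satisfied from the buffer without a further device read, while the stand-in device is unchanged until flush runs.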


The system recognizes when a program has made accesses to sequential blocks of a file, 
and asynchronously pre-reads the next block. This significantly reduces the running time of 
most programs while adding little to system overhead. 


A program that reads or writes files in units of 512 bytes has an advantage over a pro- 
gram that reads or writes a single byte at a time, but the gain is not immense; it comes mainly 
from the avoidance of system overhead. If a program is used rarely or does no great volume 
of I/O, it may quite reasonably read and write in units as small as it wishes. 


The notion of the i-list is an unusual feature of UNIX. In practice, this method of organ- 
izing the file system has proved quite reliable and easy to deal with. To the system itself, one 
of its strengths is the fact that each file has a short, unambiguous name related in a simple 
way to the protection, addressing, and other information needed to access the file. It also per- 
mits a quite simple and rapid algorithm for checking the consistency of a file system, for 
example, verification that the portions of each device containing useful information and those 
free to be allocated are disjoint and together exhaust the space on the device. This algorithm 
is independent of the directory hierarchy, because it need only scan the linearly organized i- 
list. At the same time the notion of the i-list induces certain peculiarities not found in other 
file system organizations. For example, there is the question of who is to be charged for the 
space a file occupies, because all directory entries for a file have equal status. Charging the 
owner of a file is unfair in general, for one user may create a file, another may link to it, and 
the first user may delete the file. The first user is still the owner of the file, but it should be 
charged to the second user. The simplest reasonably fair algorithm seems to be to spread the 
charges equally among users who have links to a file. Many installations avoid the issue by 
not charging any fees at all. 
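
The equal-shares rule can be written down directly. The following Python sketch is hypothetical: the function name, the input format (blocks used, paired with the list of users holding a link), and the choice of blocks as the unit of charge are assumptions, not part of any accounting program described here.

```python
from collections import defaultdict


def spread_charges(files):
    """files: iterable of (blocks_used, [users who have a link to the file])."""
    bill = defaultdict(float)
    for blocks, linkers in files:
        share = blocks / len(linkers)     # equal share for each linking user
        for user in linkers:
            bill[user] += share
    return dict(bill)
```

For a 10-block file linked by two users and a 4-block file linked by only one of them, the first user is charged 5 blocks and the second 9.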


V. PROCESSES AND IMAGES 


An image is a computer execution environment. It includes a memory image, general 
register values, status of open files, current directory and the like. An image is the current 
state of a pseudo-computer. 


A process is the execution of an image. While the processor is executing on behalf of a 
process, the image must reside in main memory; during the execution of other processes it 
remains in main memory unless the appearance of an active, higher-priority process forces it 
to be swapped out to the disk. 


The user-memory part of an image is divided into three logical segments. The program 
text segment begins at location 0 in the virtual address space. During execution, this segment 
is write-protected and a single copy of it is shared among all processes executing the same 
program. At the first hardware protection byte boundary above the program text segment in 
the virtual address space begins a non-shared, writable data segment, the size of which may be 
extended by a system call. Starting at the highest address in the virtual address space is a 
stack segment, which automatically grows downward as the stack pointer fluctuates. 


5.1 Processes 
Except while the system is bootstrapping itself into operation, a new process can come 
into existence only by use of the fork system call: 
processid = fork () 


When fork is executed, the process splits into two independently executing processes. The 
two processes have independent copies of the original memory image, and share all open files. 
The new processes differ only in that one is considered the parent process: in the parent, the 
returned processid actually identifies the child process and is never 0, while in the child, the 
returned value is always 0. 


Because the values returned by fork in the parent and child process are distinguishable, 
each process may determine whether it is the parent or child. 
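
The test on the returned value looks like this in practice. The sketch uses Python's os.fork, which exposes the same primitive on present-day UNIX systems; the function name fork_demo and the exit status 7 are arbitrary choices for illustration, and the parent collects the child with a wait call of the kind described in Section 5.4.

```python
import os


def fork_demo():
    """Fork once; return (child's process ID, child's exit status)."""
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. Leave at once with a recognizable status.
        os._exit(7)
    # Parent: fork returned the child's process ID, which is never 0.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

Both branches run the same program text; only the value returned by fork tells each process which role it has.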


5.2 Pipes 
Processes may communicate with related processes using the same system read and 
write calls that are used for file-system I/O. The call: 
filep = pipe( ) 


returns a file descriptor filep and creates an inter-process channel called a pipe. This chan- 
nel, like other open files, is passed from parent to child process in the image by the fork call. 
A read using a pipe file descriptor waits until another process writes using the file descriptor 
for the same pipe. At this point, data are passed between the images of the two processes. 
Neither process need know that a pipe, rather than an ordinary file, is involved. 


Although inter-process communication via pipes is a quite valuable tool (see Section 6.2), 
it is not a completely general mechanism, because the pipe must be set up by a common 
ancestor of the processes involved. 
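
A minimal sketch of the arrangement follows, again in Python. One difference from the call shown above should be noted: Python's os.pipe returns a pair of descriptors, one for reading and one for writing, rather than the single filep. The function name and the message are assumptions for illustration.

```python
import os


def pipe_demo(message=b"hello through the pipe\n"):
    """Create a pipe, fork, and pass `message` from child to parent."""
    read_fd, write_fd = os.pipe()      # set up by the common ancestor
    pid = os.fork()
    if pid == 0:
        # Child: the pipe descriptors were inherited across the fork.
        os.close(read_fd)
        os.write(write_fd, message)
        os._exit(0)
    # Parent: the read waits until the child has written.
    os.close(write_fd)
    data = os.read(read_fd, 512)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data
```

The read and write calls are the same ones used for ordinary files; nothing in either branch depends on the descriptor being a pipe.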


5.3 Execution of programs 
Another major system primitive is invoked by 


execute (file, arg1, arg2, ..., argn) 


which requests the system to read in and execute the program named by file, passing it string 
transliteration, selection of lines according to a pattern, sorting of the input, and encryption 
and decryption. 
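
As an illustration of the filter idea, here is a small transliteration filter sketched in Python. It is not one of the system's filters; the function names and the default character sets (lower case mapped to upper case) are assumptions chosen for the example.

```python
import sys

LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def translit(text, source=LOWER, target=UPPER):
    """Replace each character of `source` by the matching one in `target`."""
    return text.translate(str.maketrans(source, target))


def filter_lines(lines, write):
    """Copy input to output, transliterating each line on the way."""
    for line in lines:
        write(translit(line))
```

Run as a command, the program would call filter_lines(sys.stdin, sys.stdout.write); because it touches only its standard input and output, it can be placed anywhere in a sequence of commands joined by vertical bars.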


6.3 Command separators; multitasking 


Another feature provided by the shell is relatively straightforward. Commands need not 
be on different lines; instead they may be separated by semicolons: 


ls; ed 
will first list the contents of the current directory, then enter the editor. 


A related feature is more interesting. If a command is followed by “&,” the shell will 
not wait for the command to finish before prompting again; instead, it is ready immediately to 
accept a new command. For example: 


as source >output & 


causes source to be assembled, with diagnostic output going to output; no matter how long 
the assembly takes, the shell returns immediately. When the shell does not wait for the com- 
pletion of a command, the identification number of the process running that command is 
printed. This identification may be used to wait for the completion of the command or to ter- 
minate it. The “&” may be used several times in a line: 


as source >output & ls >files & 


does both the assembly and the listing in the background. In these examples, an output file 
other than the terminal was provided; if this had not been done, the outputs of the various 
commands would have been intermingled. 


The shell also allows parentheses in the above operations. For example: 
(date; ls) >x & 


writes the current date and time followed by a list of the current directory onto the file x. 
The shell also returns immediately for another request. 


6.4 The shell as a command; command files 


The shell is itself a command, and may be called recursively. Suppose file tryout con- 
tains the lines: 


as source 
mv a.out testprog 
testprog 


The mv command causes the file a.out to be renamed testprog. a.out is the (binary) output 
of the assembler, ready to be executed. Thus if the three lines above were typed on the key- 
board, source would be assembled, the resulting program renamed testprog, and testprog 
executed. When the lines are in tryout, the command: 


sh <tryout 


would cause the shell sh to execute the commands sequentially. 


The shell has further capabilities, including the ability to substitute parameters and to 
construct argument lists from a specified subset of the file names in a directory. It also pro- 
vides general conditional and looping constructions. 


6.5 Implementation of the shell 


The outline of the operation of the shell can now be understood. Most of the time, the 
shell is waiting for the user to type a command. When the newline character ending the line 
is typed, the shell’s read call returns. The shell analyzes the command line, putting the arguments 
in a form appropriate for execute. Then fork is called. The child process, whose 
code of course is still that of the shell, attempts to perform an execute with the appropriate 
arguments. If successful, this will bring in and start execution of the program whose name 
was given. Meanwhile, the other process resulting from the fork, which is the parent process, 
waits for the child process to die. When this happens, the shell knows the command is 
finished, so it types its prompt and reads the keyboard to obtain another command. 


Given this framework, the implementation of background processes is trivial; whenever a 
command line contains “&,” the shell merely refrains from waiting for the process that it 
created to execute the command. 
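
The whole cycle, including the treatment of “&”, fits in a short Python sketch. It is a simplification made under several assumptions: the function name is invented, the line is split with Python's shlex module rather than by the shell's own rules, redirection is left out, and the status 127 for a failed execute follows a later shell convention rather than anything stated here.

```python
import os
import shlex


def run_command_line(line):
    """Run one command line. Returns (pid, status); status is None when
    the command was left running in the background."""
    line = line.rstrip()
    background = line.endswith("&")
    argv = shlex.split(line.rstrip("&"))
    if not argv:
        return None
    pid = os.fork()
    if pid == 0:
        # Child: still the shell's code until the execute succeeds.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)          # reached only if the execute failed
    if background:
        return pid, None           # refrain from waiting
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

A shell built on this would call run_command_line for each line read and print the returned process ID whenever the status is None.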


Happily, all of this mechanism meshes very nicely with the notion of standard input and 
output files. When a process is created by the fork primitive, it inherits not only the memory 
image of its parent but also all the files currently open in its parent, including those with file 
descriptors 0, 1, and 2. The shell, of course, uses these files to read command lines and to 
write its prompts and diagnostics, and in the ordinary case its children—the command 
programs—inherit them automatically. When an argument with “<” or “>” is given, however, 
the offspring process, just before it performs execute, makes the standard I/O file descriptor 
(0 or 1, respectively) refer to the named file. This is easy because, by agreement, the smallest 
unused file descriptor is assigned when a new file is opened (or created); it is only necessary 
to close file 0 (or 1) and open the named file. Because the process in which the command pro- 
gram runs simply terminates when it is through, the association between a file specified after 
“<” or “>” and file descriptor 0 or 1 is ended automatically when the process dies. Therefore 
the shell need not know the actual names of the files that are its own standard input and out- 
put, because it need never reopen them. 


Filters are straightforward extensions of standard I/O redirection with pipes used instead 
of files. 
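
A two-command sequence shows how little changes when a pipe replaces the file. The Python sketch below is an assumption-laden illustration: the function name is invented, and it uses os.dup2, a later convenience, in place of the close-and-open idiom described above.

```python
import os


def pipeline(first, second):
    """Run the equivalent of `first | second`; return second's exit status."""
    read_fd, write_fd = os.pipe()
    left = os.fork()
    if left == 0:
        try:
            os.dup2(write_fd, 1)          # standard output goes to the pipe
            os.close(read_fd)
            os.close(write_fd)
            os.execvp(first[0], first)
        finally:
            os._exit(127)
    right = os.fork()
    if right == 0:
        try:
            os.dup2(read_fd, 0)           # standard input comes from the pipe
            os.close(read_fd)
            os.close(write_fd)
            os.execvp(second[0], second)
        finally:
            os._exit(127)
    # The parent keeps neither end, so the reader sees end of file
    # once the writer has finished.
    os.close(read_fd)
    os.close(write_fd)
    os.waitpid(left, 0)
    _, status = os.waitpid(right, 0)
    return os.WEXITSTATUS(status)
```

Both commands run simultaneously, and neither needs any code of its own to take part.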


In ordinary circumstances, the main loop of the shell never terminates. (The main loop 
includes the branch of the return from fork belonging to the parent process; that is, the 
branch that does a wait, then reads another command line.) The one thing that causes the 
shell to terminate is discovering an end-of-file condition on its input file. Thus, when the 
shell is executed as a command with a given input file, as in: 


sh <comfile 


the commands in comfile will be executed until the end of comfile is reached; then the 
instance of the shell invoked by sh will terminate. Because this shell process is the child of 
another instance of the shell, the wait executed in the latter will return, and another com- 
mand may then be processed. 


6.6 Initialization 


The instances of the shell to which users type commands are themselves children of 
another process. The last step in the initialization of the system is the creation of a single 
process and the invocation (via execute) of a program called init. The role of init is to 
create one process for each terminal channel. The various subinstances of init open the 
appropriate terminals for input and output on files 0, 1, and 2, waiting, if necessary, for carrier 
to be established on dial-up lines. Then a message is typed out requesting that the user log 
in. When the user types a name or other identification, the appropriate instance of init wakes 
up, receives the log-in line, and reads a password file. If the user’s name is found, and if he is 
able to supply the correct password, init changes to the user’s default current directory, sets 
the process’s user ID to that of the person logging in, and performs an execute of the shell. 
At this point, the shell is ready to receive commands and the logging-in protocol is complete. 


Meanwhile, the mainstream path of init (the parent of all the subinstances of itself that 
will later become shells) does a wait. If one of the child processes terminates, either because 
a shell found an end of file or because a user typed an incorrect name or password, this path 
of init simply recreates the defunct process, which in turn reopens the appropriate input and 
output files and types another log-in message. Thus a user may log out simply by typing the 
arguments arg1, arg2, ..., argn. All the code and data in the process invoking execute is 
replaced from the file, but open files, current directory, and inter-process relationships are 
unaltered. Only if the call fails, for example because file could not be found or because its 
execute-permission bit was not set, does a return take place from the execute primitive; it 
resembles a “jump” machine instruction rather than a subroutine call. 
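
The “jump” behavior is easy to see in a sketch. In the Python below, os.execv plays the part of execute; the function name and the status 127 used to report a failed call are assumptions for illustration.

```python
import os


def run(argv):
    """Fork, then replace the child's code and data with the program argv[0]."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(argv[0], argv)    # on success, control never comes back
        finally:
            os._exit(127)              # reached only if the call failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

When the named file exists and is executable, the status returned is whatever the new program chooses; when it does not, the statement after the call runs and the child reports 127.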


5.4 Process synchronization 


Another process control system call: 
processid = wait (status ) 


causes its caller to suspend execution until one of its children has completed execution. Then 
wait returns the processid of the terminated process. An error return is taken if the calling 
process has no descendants. Certain status from the child process is also available. 


5.5 Termination 
Lastly: 


exit (status ) 


terminates a process, destroys its image, closes its open files, and generally obliterates it. The 
parent is notified through the wait primitive, and status is made available to it. Processes 
may also terminate as a result of various illegal actions or user-generated signals (Section VII 
below). 
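
The pairing of exit and wait can be sketched as below. The Python function and its name are hypothetical; each child simply exits with a chosen status, and the parent waits until every one has been collected. In Python, the error return for a process with no children appears as a ChildProcessError raised by os.wait.

```python
import os


def reap_children(statuses):
    """Start one child per status, wait for all, and return two maps
    from process ID to status: what was requested and what wait reported."""
    started = {}
    for status in statuses:
        pid = os.fork()
        if pid == 0:
            os._exit(status)             # child terminates immediately
        started[pid] = status
    reaped = {}
    while len(reaped) < len(started):
        pid, raw = os.wait()             # returns when any one child finishes
        if pid in started:
            reaped[pid] = os.WEXITSTATUS(raw)
    return started, reaped
```

The order in which the children are reported is not fixed, which is why the results are keyed by process ID.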


VI. THE SHELL 


For most users, communication with the system is carried on with the aid of a program 
called the shell. The shell is a command-line interpreter: it reads lines typed by the user and 
interprets them as requests to execute other programs. (The shell is described fully else- 
where,® so this section will discuss only the theory of its operation.) In simplest form, a com- 
mand line consists of the command name followed by arguments to the command, all 
separated by spaces: 


command arg1 arg2 ... argn 


The shell splits up the command name and the arguments into separate strings. Then a file 
with name command is sought; command may be a path name including the “/” character 
to specify any file in the system. If command is found, it is brought into memory and exe- 
cuted. The arguments collected by the shell are accessible to the command. When the com- 
mand is finished, the shell resumes its own execution, and indicates its readiness to accept 
another command by typing a prompt character. 


If file command cannot be found, the shell generally prefixes a string such as /bin/ to 
command and attempts again to find the file. Directory /bin contains commands intended 
to be generally used. (The sequence of directories to be searched may be changed by user 
request.) 


6.1 Standard I/O 


The discussion of I/O in Section III above seems to imply that every file used by a pro- 
gram must be opened or created by the program in order to get a file descriptor for the file. 
Programs executed by the shell, however, start off with three open files with file descriptors 0, 
1, and 2. As such a program begins execution, file 1 is open for writing, and is best under- 
stood as the standard output file. Except under circumstances indicated below, this file is the 
user’s terminal. Thus programs that wish to write informative information ordinarily use file 
descriptor 1. Conversely, file 0 starts off open for reading, and programs that wish to read 
messages typed by the user read this file. 


The shell is able to change the standard assignments of these file descriptors from the 
user’s terminal printer and keyboard. If one of the arguments to a command is prefixed by 
“>”, file descriptor 1 will, for the duration of the command, refer to the file named after the 
“>”. For example: 


ls 


ordinarily lists, on the typewriter, the names of the files in the current directory. The com- 
mand: 


ls >there 


creates a file called there and places the listing there. Thus the argument >there means 
“place output on there.” On the other hand: 


ed 


ordinarily enters the editor, which takes requests from the user via his keyboard. The com- 
mand 


ed <script 


interprets script as a file of editor commands; thus <script means “take input from script.” 


Although the file name following “<” or “>” appears to be an argument to the com- 
mand, in fact it is interpreted completely by the shell and is not passed to the command at 
all. Thus no special coding to handle I/O redirection is needed within each command; the 
command need merely use the standard file descriptors 0 and 1 where appropriate. 


File descriptor 2 is, like file 1, ordinarily associated with the terminal output stream. 
When an output-diversion request with “>” is specified, file 2 remains attached to the termi- 
nal, so that commands may produce diagnostic messages that do not silently end up in the 
output file. 


6.2 Filters 


An extension of the standard I/O notion is used to direct output from one command to 
the input of another. A sequence of commands separated by vertical bars causes the shell to 
execute all the commands simultaneously and to arrange that the standard output of each 
command be delivered to the standard input of the next command in the sequence. Thus in 
the command line: 


ls | pr -2 | opr 


ls lists the names of the files in the current directory; its output is passed to pr, which 
paginates its input with dated headings. (The argument “-2” requests double-column output.) 
Likewise, the output from pr is input to opr; this command spools its input onto a file for 
off-line printing. 

This procedure could have been carried out more clumsily by: 


ls >temp1 
pr -2 <temp1 >temp2 
opr <temp2 


followed by removal of the temporary files. In the absence of the ability to redirect output 
and input, a still clumsier method would have been to require the ls command to accept user 
requests to paginate its output, to print in multi-column format, and to arrange that its output 
be delivered off-line. Actually it would be surprising, and in fact unwise for efficiency reasons, 
to expect authors of commands such as ls to provide such a wide variety of output 
options. 


A program such as pr which copies its standard input to its standard output (with processing) 
is called a filter. Some filters that we have found useful perform character 
end-of-file sequence to the shell. 


6.7 Other programs as shell 


The shell as described above is designed to allow users full access to the facilities of the 
system, because it will invoke the execution of any program with appropriate protection mode. 
Sometimes, however, a different interface to the system is desirable, and this feature is easily 
arranged for. 


Recall that after a user has successfully logged in by supplying a name and password, 
init ordinarily invokes the shell to interpret command lines. The user’s entry in the password 
file may contain the name of a program to be invoked after log-in instead of the shell. This 
program is free to interpret the user’s messages in any way it wishes. 


For example, the password file entries for users of a secretarial editing system might 
specify that the editor ed is to be used instead of the shell. Thus when users of the editing 
system log in, they are inside the editor and can begin work immediately; also, they can be 
prevented from invoking programs not intended for their use. In practice, it has proved desir- 
able to allow a temporary escape from the editor to execute the formatting program and other 
utilities. 

Several of the games (e.g., chess, blackjack, 3D tic-tac-toe) available on the system illus- 
trate a much more severely restricted environment. For each of these, an entry exists in the 
password file specifying that the appropriate game-playing program is to be invoked instead of 
the shell. People who log in as a player of one of these games find themselves limited to the 
game and unable to investigate the (presumably more interesting) offerings of the UNIX sys- 
tem as a whole. 


VII. TRAPS 


The PDP-11 hardware detects a number of program faults, such as references to non- 
existent memory, unimplemented instructions, and odd addresses used where an even address 
is required. Such faults cause the processor to trap to a system routine. Unless other 
arrangements have been made, an illegal action causes the system to terminate the process 
and to write its image on file core in the current directory. A debugger can be used to deter- 
mine the state of the program at the time of the fault. 


Programs that are looping, that produce unwanted output, or about which the user has 
second thoughts may be halted by the use of the interrupt signal, which is generated by typ- 
ing the “delete” character. Unless special action has been taken, this signal simply causes the 
program to cease execution without producing a core file. There is also a quit signal used to 
force an image file to be produced. Thus programs that loop unexpectedly may be halted and 
the remains inspected without prearrangement. 


The hardware-generated faults and the interrupt and quit signals can, by request, be 
either ignored or caught by a process. For example, the shell ignores quits to prevent a quit 
from logging the user out. The editor catches interrupts and returns to its command level. 
This is useful for stopping long printouts without losing work in progress (the editor manipu- 
lates a copy of the file it is editing). In systems without floating-point hardware, unimple- 
mented instructions are caught and floating-point instructions are interpreted. 
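
The two dispositions mentioned, ignoring and catching, look like this with Python's signal module. The sketch is illustrative: the handler, the list used to record what was caught, and the function names are assumptions, and SIGINT and SIGQUIT are the present-day names for the interrupt and quit signals.

```python
import signal

caught = []   # signals seen by the handler, in order of arrival


def on_interrupt(signum, frame):
    """Record the interrupt and return, much as the editor returns to
    its command level instead of terminating."""
    caught.append(signum)


def install_handlers():
    signal.signal(signal.SIGQUIT, signal.SIG_IGN)   # ignore quit, as the shell does
    signal.signal(signal.SIGINT, on_interrupt)      # catch interrupt, as the editor does
```

Once install_handlers has run, an interrupt no longer stops the program, and a quit neither stops it nor produces an image file.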


VIII. PERSPECTIVE 


Perhaps paradoxically, the success of the UNIX system is largely due to the fact that it 
was not designed to meet any predefined objectives. The first version was written when one of 
us (Thompson), dissatisfied with the available computer facilities, discovered a little-used 
PDP-7 and set out to create a more hospitable environment. This (essentially personal) effort 
was sufficiently successful to gain the interest of the other author and several colleagues, and 
later to justify the acquisition of the PDP-11/20, specifically to support a text editing and formatting 
system. When in turn the 11/20 was outgrown, the system had proved useful enough 
to persuade management to invest in the PDP-11/45, and later in the PDP-11/70 and Interdata 
8/32 machines, upon which it developed to its present form. Our goals throughout the effort, 
when articulated at all, have always been to build a comfortable relationship with the machine 
and to explore ideas and inventions in operating systems and other software. We have not 
been faced with the need to satisfy someone else’s requirements, and for this freedom we are 
grateful. 


Three considerations that influenced the design of UNIX are visible in retrospect. 


First: because we are programmers, we naturally designed the system to make it easy to 
write, test, and run programs. The most important expression of our desire for programming 
convenience was that the system was arranged for interactive use, even though the original 
version only supported one user. We believe that a properly designed interactive system is 
much more productive and satisfying to use than a “batch” system. Moreover, such a system 
is rather easily adaptable to noninteractive use, while the converse is not true. 


Second: there have always been fairly severe size constraints on the system and its 
software. Given the partially antagonistic desires for reasonable efficiency and expressive 
power, the size constraint has encouraged not only economy, but also a certain elegance of 
design. This may be a thinly disguised version of the “salvation through suffering” philoso- 
phy, but in our case it worked. 


Third: nearly from the start, the system was able to, and did, maintain itself. This fact 
is more important than it might seem. If designers of a system are forced to use that system, 
they quickly become aware of its functional and superficial deficiencies and are strongly 
motivated to correct them before it is too late. Because all source programs were always avail- 
able and easily modified on-line, we were willing to revise and rewrite the system and its 
software when new ideas were invented, discovered, or suggested by others. 


The aspects of UNIX discussed in this paper exhibit clearly at least the first two of these 
design considerations. The interface to the file system, for example, is extremely convenient 
from a programming standpoint. The lowest possible interface level is designed to eliminate 
distinctions between the various devices and files and between direct and sequential access. 
No large “access method” routines are required to insulate the programmer from the system 
calls; in fact, all user programs either call the system directly or use a small library program, 
less than a page long, that buffers a number of characters and reads or writes them all at once. 
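
A buffering routine of the kind mentioned really can be small. The Python sketch below is not the library program itself; the class name, the method names, and the default size of 512 bytes are assumptions for illustration.

```python
import os


class CharBuffer:
    """Collect characters and pass them to write in blocks."""

    def __init__(self, fd, size=512):
        self.fd = fd
        self.size = size
        self.pending = bytearray()

    def putc(self, byte):
        self.pending.append(byte)
        if len(self.pending) >= self.size:
            self.flush()

    def flush(self):
        data = bytes(self.pending)
        while data:                        # repeat if a write is only partial
            data = data[os.write(self.fd, data):]
        self.pending.clear()
```

A program that produces output one character at a time calls putc throughout and flush once before it exits, so the system sees roughly one write per 512 characters.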


Another important aspect of programming convenience is that there are no “control 
blocks” with a complicated structure partially maintained by and depended on by the file sys- 
tem or other system calls. Generally speaking, the contents of a program’s address space are 
the property of the program, and we have tried to avoid placing restrictions on the data struc- 
tures within that address space. 


Given the requirement that all programs should be usable with any file or device as 
input or output, it is also desirable to push device-dependent considerations into the operating 
system itself. The only alternatives seem to be to load, with all programs, routines for dealing 
with each device, which is expensive in space, or to depend on some means of dynamically 
linking to the routine appropriate to each device when it is actually needed, which is expen- 
sive either in overhead or in hardware. 


Likewise, the process-control scheme and the command interface have proved both con- 
venient and efficient. Because the shell operates as an ordinary, swappable user program, it 
consumes no “wired-down” space in the system proper, and it may be made as powerful as 
desired at little cost. In particular, given the framework in which the shell executes as a pro- 
cess that spawns other processes to perform commands, the notions of I/O redirection, back- 
ground processes, command files, and user-selectable system interfaces all become essentially 
trivial to implement. 


Influences 


The success of UNIX lies not so much in new inventions but rather in the full exploita- 
tion of a carefully selected set of fertile ideas, and especially in showing that they can be keys 
to the implementation of a small yet powerful operating system. 


The fork operation, essentially as we implemented it, was present in the GENIE time- 
sharing system.’ On a number of points we were influenced by Multics, which suggested the 
particular form of the I/O system calls? and both the name of the shell and its general func- 
tions. The notion that the shell should create a process for each command was also suggested 
to us by the early design of Multics, although in that system it was later dropped for efficiency 
reasons. A similar scheme is used by TENEX.9 


IX. STATISTICS 


The following numbers are presented to suggest the scale of the Research UNIX opera- 
tion. Those of our users not involved in document preparation tend to use the system for pro- 
gram development, especially language work. There are few important “applications” pro- 
grams. 


Overall, we have today: 


125        user population 
33         maximum simultaneous users 
1,630      directories 
28,300     files 
301,700    512-byte secondary storage blocks used 


There is a “background” process that runs at the lowest possible priority; it is used to soak up 
any idle CPU time. It has been used to produce a million-digit approximation to the constant 
e, and other semi-infinite problems. Not counting this background work, we average daily: 


13,500     commands 
9.6        CPU hours 
230        connect hours 
62         different users 
240        log-ins 


PART 2: GETTING STARTED 


The following four articles will help you begin using the ULTRIX-32 system quickly and pro- 
ductively. “UNIX for Beginners,” by Kernighan, is for all beginners; it’s essential. Be sure 
to read this article before going on to anything else in the ULTRIX-32 system. The article on 
mail comes next in importance, since the mail utility lets you exchange messages with other 
people using the system. And the articles on the bc and dc desk calculator utilities will get 
you started using some of the interactive math capabilities of the ULTRIX-32 system. 


UNIX for Beginners 


This article explains ULTRIX-32 system concepts and tells how to use the major features of 
the software system. If you want to get going fast, log in to an ULTRIX-32 system, and 
experiment with the commands shown in the examples as you read along. The article intro- 
duces: 


• Using dial-up and hard-wired terminals to communicate with ULTRIX-32 (UNIX) 
• Logging in 
• Using simple commands and command options 
• Creating, printing, and displaying files 
• Listing directory contents 
• Finding your way through directory hierarchies 
• Using scripts to automate command sequences 
• Redirecting process output to files instead of to a terminal 
• Using pipes to coordinate and combine tasks 
• Using the text formatting packages 
• Preparing a bibliography 
• Searching files for a character string 
• Programming in C and other languages: guidelines 


While not up-to-date, the UNIX reading list supplied at the end of the article is useful; many 
of the items referenced are included in this document set. 


NOTE 
ULTRIX-32 implements some commands differently from the ways 
explained in the article. Specifically: 


CTRL/C The default interrupt command. 
CTRL/U The default delete line command. 
<delete character> The default delete command. 




The “Mail Reference Manual,” by Shoens, offers a tutorial format, like “UNIX for 
Beginners.” It tells you how to use each feature of the mail utility, including: 


• Sending and receiving messages 
• Saving or disposing of old messages 
• Maintaining message folders 
• Leaving and reentering the mail utility in the middle of a job 
• Sending mail across a network 
• Using aliases to simplify message distribution 


In addition, the article on mail is a complete reference manual. It defines all mail commands, 
custom options, command-line options, and the standard message format. Mail is the default 
mailer for C Shell users. 


Desk Calculator Utilities 


ULTRIX-32 offers two desk calculator utilities: bc and dc. Both utilities can take input from 
the keyboard and from program files, and both perform mathematical functions. Bc is easier 
to use than dc, however, because it operates at a higher programming level than dc. 


Bc allows you to enter data and commands in a conventional format similar to the formats of 
BASIC and C. The article entitled “BC - An Arbitrary Precision Desk-Calculator Language,” 
by Cherry and Morris, gives rules for using bc and some good examples. It explains bc: 


• Math capabilities 
• Precision capabilities 
• Function definition and use 
• One-dimensional arrays 
• Flow control 
• Operator symbols consistent with C 
• Library functions for trigonometry, logarithms, exponentiation, and Bessel functions 


“DC - An Interactive Desk Calculator,” also by Cherry and Morris, lists the rules and func- 
tions of the dc utility, but examples are few. The article explains the use of a push-down stack 
for calculations and data manipulation. Only data stored on the stack is available for opera- 
tions. The authors list commands and programming features and explain the internal 
representation and manipulation of numbers. 


The bc utility is layered on dc: dc interprets the output of the bc compiler. This relationship 
is transparent to users, but significant if you are choosing between the two utilities. Bc is the 
practical choice for most users, because it really does resemble a desk calculator; dc is closer to 
an assembly language than a calculator, and as such it is a tool for sophisticated users. 
UNIX For Beginners — Second Edition 


Brian W. Kernighan 


Bell Laboratories 
Murray Hill, New Jersey 07974 


INTRODUCTION 


From the user's point of view, the UNIX 
operating system is easy to learn and use, and 
presents few of the usual impediments to getting 
the job done. It is hard, however, for the 
beginner to know where to start, and how to 
make the best use of the facilities available. The 
purpose of this introduction is to help new users 
get used to the main ideas of the UNIX system 
and start making effective use of it quickly. 


You should have a couple of other docu- 
ments with you for easy reference as you read 
this one. The most important is The UNIX 
Programmer's Manual; it’s often easier to tell you 
to read about something in the manual than to 
repeat its contents here. The other useful docu- 
ment is A Tutorial Introduction to the UNIX Text 
Editor, which will tell you how to use the editor 
to get text — programs, data, documents — into 
the computer. 


A word of warning: the UNIX system has 
become quite popular, and there are several 
major variants in widespread use. Of course 
details also change with time. So although the 
basic structure of UNIX and how to use it is com- 
mon to all versions, there will certainly be a few 
things which are different on your system from 
what is described here. We have tried to minim- 
ize the problem, but be aware of it. In cases of 
doubt, this paper describes Version 7 UNIX. 


This paper has five sections: 


1. Getting Started: How to log in, how to type, 
what to do about mistakes in typing, how to 
log out. Some of this is dependent on which 
system you log into (phone numbers, for 
example) and what terminal you use, so this 
section must necessarily be supplemented by 
local information. 


2. Day-to-day Use: Things you need every day 
to use the system effectively: generally use- 
ful commands; the file system. 


3. Document Preparation: Preparing manu- 
scripts is one of the most common uses for 
UNIX systems. This section contains advice, 
but not extensive instructions on any of the 
formatting tools. 


4. Writing Programs: UNIX is an excellent sys- 
tem for developing programs. This section 
talks about some of the tools, but again is 
not a tutorial in any of the programming 
languages provided by the system. 


5. A UNIX Reading List. An annotated 
bibliography of documents that new users 
should be aware of. 


I. GETTING STARTED 


Logging In 


You must have a UNIX login name, which 
you can get from whoever administers your sys- 
tem. You also need to know the phone number, 
unless your system uses permanently connected 
terminals. The UNIX system is capable of deal- 
ing with a wide variety of terminals: Terminet 
300's; Execuport, TI and similar portables; video 
(CRT) terminals like the HP2640, etc.; high- 
priced graphics terminals like the Tektronix 
4014; plotting terminals like those from GSI and 
DASI; and even the venerable Teletype in its 
various forms. But note: UNIX is strongly 
oriented towards devices with lower case. If your 
terminal produces only upper case (e.g., model 
33 Teletype, some video and portable terminals), 
life will be so difficult that you should look for 
another terminal. 


Be sure to set the switches appropriately on 
your device. Switches that might need to be 
adjusted include the speed, upper/lower case 
mode, full duplex, even parity, and any others 
that local wisdom advises. Establish a connec- 
tion using whatever magic is needed for your ter- 
minal; this may involve dialing a telephone call 
or merely flipping a switch. In either case, UNIX 
should type “login:” at you. If it types garbage, 
you may be at the wrong speed; check the 
switches. If that fails, push the “break” or 
“interrupt” key a few times, slowly. If that fails 
to produce a login message, consult a guru. 


When you get a login: message, type your 
login name in lower case. Follow it by a 
RETURN; the system will not do anything until 
you type a RETURN. If a password is required, 
you will be asked for it, and (if possible) printing 
will be turned off while you type it. Don't forget 
RETURN. 

The culmination of your login efforts is a 
“prompt character,” a single character that indi- 
cates that the system is ready to accept com- 
mands from you. The prompt character is usu- 
ally a dollar sign $ or a percent sign %. (You 
may also get a message of the day just before the 
prompt character, or a notification that you have 
mail.) 


Typing Commands 

Once you've seen the prompt character, you 
can type commands, which are requests that the 
system do something. Try typing 


date 


followed by RETURN. You should get back 
something like 


Mon Jan 16 14:17:10 EST 1978 


Don’t forget the RETURN after the command, or 
nothing will happen. If you think you're being 
ignored, type a RETURN; something should hap- 
pen. RETURN won't be mentioned again, but 
don’t forget it — it has to be there at the end of 
each line. 


Another command you might try is who, 
which tells you everyone who is currently logged 
in: 


who 
gives something like 


mb    tty01   Jan 16 09:11 
ski   tty05   Jan 16 09:33 
gam   tty11   Jan 16 13:07 


The time is when the user logged in; “ttyxx” is 
the system's idea of what terminal the user is on. 


If you make a mistake typing the command 
name, and refer to a non-existent command, you 
will be told. For example, if you type 


whom 
you will be told 
whom: not found 


Of course, if you inadvertently type the name of 
some other command, it will run, with more or 
less mysterious results. 


Strange Terminal Behavior 


Sometimes you can get into a state where 
your terminal acts strangely. For example, each 
letter may be typed twice, or the RETURN may 
not cause a line feed or a return to the left mar- 
gin. You can often fix this by logging out and 
logging back in. Or you can read the description 
of the command stty in section I of the manual. 
To get intelligent treatment of tab characters 
(which are much used in UNIX) if your terminal 
doesn’t have tabs, type the command 


stty -tabs 


and the system will convert each tab into the 
right number of blanks for you. If your terminal 
does have computer-settable tabs, the command 
tabs will set the stops correctly for you. 


Mistakes in Typing 


If you make a typing mistake, and see it 
before RETURN has been typed, there are two 
ways to recover. The sharp-character # erases 
the last character typed: in fact successive uses of 
# erase characters back to the beginning of the 
line (but not beyond). So if you type badly, you 
can correct as you go: 


dd#atte##e 


is the same as date. 
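
The effect of the erase character can be imitated on a current system with a short sed loop. This is only a sketch under stated assumptions: it takes # to be the erase character, as in this article, and it ignores the backslash escape; the real processing is done by the terminal driver, not by sed.

```shell
# Imitate '#' erase processing: repeatedly delete one ordinary
# character together with the '#' that follows it.
typed='dd#atte##e'
result=$(printf '%s\n' "$typed" | sed -e ':a' -e 's/[^#]#//' -e 'ta')
echo "$result"      # prints: date
```

Each pass of the loop removes one erased character, so the line shrinks until no # is left.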


The at-sign @ erases all of the characters 
typed so far on the current input line, so if the 
line is irretrievably fouled up, type an @ and 
start the line over. 


What if you must enter a sharp or at-sign as 
part of the text? If you precede either # or @ 
by a backslash \, it loses its erase meaning. So 
to enter a sharp or at-sign in something, type \# 
or \@. The system will always echo a newline at 
you after your at-sign, even if preceded by a 
backslash. Don't worry — the at-sign has been 
recorded. 


To erase a backslash, you have to type two 
sharps or two at-signs, as in \##. The backslash 
is used extensively in UNIX to indicate that the 
following character is in some way special. 


Read-ahead 


UNIX has full read-ahead, which means that 
you can type as fast as you want, whenever you 
want, even when some command is typing at 
you. If you type during output, your input char- 
acters will appear intermixed with the output 
characters, but they will be stored away and 
interpreted in the correct order. So you can type 
several commands one after another without 
waiting for the first to finish or even begin. 


Stopping a Program 


You can stop most programs by typing the 
character “DEL” (perhaps called “delete” or 
“rubout” on your terminal). The “interrupt” or 
“break” key found on most terminals can also 
be used. In a few programs, like the text editor, 
DEL stops whatever the program is doing but 
leaves you in that program. Hanging up the 
phone will stop most programs. 


Logging Out 


The easiest way to log out is to hang up the 
phone. You can also type 


login 


and let someone else use the terminal you were 
on. It is usually not sufficient just to turn off the 
terminal. Most UNIX systems do not use a 
time-out mechanism, so you'll be there forever 
unless you hang up. 


Mail 


When you log in, you may sometimes get 
the message 


You have mail. 


UNIX provides a postal system so you can com- 
municate with other users of the system. To 
read your mail, type the command 


mail 


Your mail will be printed, one message at a time, 
most recent message first. After each message, 
mail waits for you to say what to do with it. The 
two basic responses are d, which deletes the mes- 
sage, and RETURN, which does not (so it will 
still be there the next time you read your mail- 
box). Other responses are described in the 
manual. (Earlier versions of mail do not process 
one message at a time, but are otherwise simi- 
lar.) 


How do you send mail to someone else? 
Suppose it is to go to “joe” (assuming “joe” is 
someone's login name). The easiest way is this: 

mail joe 
now type in the text of the letter 
on as many lines as you like ... 
After the last line of the letter 
type the character “control-d”, 
that is, hold down “control” and type 
a letter “d”. 


And that’s it. The “control-d” sequence, often 
called “EOF” for end-of-file, is used throughout 
the system to mark the end of input from a ter- 
minal, so you might as well get used to it. 


For practice, send mail to yourself. (This 
isn't as strange as it might sound — mail to one- 
self is a handy reminder mechanism.) 


There are other ways to send mail — you 
can send a previously prepared letter, and you 
can mail to a number of people all at once. For 
more details see mail(1). (The notation mail(1) 
means the command mail in section 1 of the 
UNIX Programmer's Manual.) 


Writing to other users 


At some point, out of the blue will come a 
message like 


Message from joe tty07... 


accompanied by a startling beep. It means that 
Joe wants to talk to you, but unless you take 
explicit action you won't be able to talk back. To 
respond, type the command 

write joe 

This establishes a two-way communication path. 
Now whatever Joe types on his terminal will 
appear on yours and vice versa. The path is 
slow, rather like talking to the moon. (If you are 
in the middle of something, you have to get to a 
state where you can type a command. Normally, 
whatever program you are running has to ter- 
minate or be terminated. If you're editing, you 
can escape temporarily from the editor — read 
the editor tutorial.) 


A protocol is needed to keep what you type 
from getting garbled up with what Joe types. 
Typically it’s like this: 


Joe types write smith and waits. 

Smith types write joe and waits. 

Joe now types his message (as many lines 
as he likes). When he’s ready for a reply, 
he signals it by typing (o), which stands 
for “over”. 

Now Smith types a reply, also terminated 
by (o). 

This cycle repeats until someone gets 
tired; he then signals his intent to quit 
with (oo), for “over and out”. 

To terminate the conversation, each side 
must type a “control-d” character alone 
on a line. (“Delete” also works.) When 
the other person types his “control-d”, 
you will get the message EOF on your 
terminal. 


If you write to someone who isn't logged in, 
or who doesn’t want to be disturbed, you'll be 
told. If the target is logged in but doesn’t answer 
after a decent interval, simply type “control-d”. 
On-line Manual 


The UNIX Programmer's Manual is typically 
kept on-line. If you get stuck on something, and 
can’t find an expert to assist you, you can print 
on your terminal some manual section that 
might help. This is also useful for getting the 
most up-to-date information on a command. To 
print a manual section, type “man command- 
name”. Thus to read up on the who command, 
type 


man who 
and, of course, 
man man 


tells all about the man command. 


Computer Aided Instruction 


Your UNIX system may have available a pro- 
gram called learn, which provides computer 
aided instruction on the file system and basic 
commands, the editor, document preparation, 
and even C programming. Try typing the com- 
mand 


learn 


If learn exists on your system, it will tell you 
what to do from there. 


II. DAY-TO-DAY USE 


Creating Files — The Editor 


If you have to type a paper or a letter or a 
program, how do you get the information stored 
in the machine? Most of these tasks are done 
with the UNIX “text editor” ed. Since ed is 
thoroughly documented in ed(1) and explained 
in A Tutorial Introduction to the UNIX Text Editor, 
we won't spend any time here describing how to 
use it. All we want it for right now is to make 
some files. (A file is just a collection of informa- 
tion stored in the machine, a simplistic but ade- 
quate definition.) 


To create a file called junk with some text in 
it, do the following: 


ed junk        (invokes the text editor) 
a              (command to “ed”, to add text) 
now type in 
whatever text you want ... 
.              (signals the end of adding text) 


The “.” that signals the end of adding text must 
be at the beginning of a line by itself. Don’t for- 
get it, for until it is typed, no other ed com- 
mands will be recognized — everything you type 
will be treated as text to be added. 


At this point you can do various editing 
operations on the text you typed in, such as 
correcting spelling mistakes, rearranging para- 
graphs and the like. Finally, you must write the 
information you have typed into a file with the 
editor command w: 


w 


ed will respond with the number of characters it 
wrote into the file junk. 


Until the w command, nothing is stored per- 
manently, so if you hang up and go home the 
information is lost.† But after w the information 
is there permanently; you can re-access it any 
time by typing 

ed junk 


Type a q command to quit the editor. (If you try 
to quit without writing, ed will print a ? to rem- 
ind you. A second q gets you out regardless.) 


Now create a second file called temp in the 
same manner. You should now have two files, 
junk and temp. 


What files are out there? 


The ls (for “list”) command lists the names 
(not contents) of any of the files that UNIX 
knows about. If you type 


ls 
the response will be 


junk 
temp 


which are indeed the two files just created. The 
names are sorted into alphabetical order 
automatically, but other variations are possible. 
For example, the command 

ls -t 


causes the files to be listed in the order in which 
they were last changed, most recent first. The 
-l option gives a “long” listing: 


ls -l 


will produce something like 


-rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk 
-rw-rw-rw- 1 bwk 78 Jul 22 2:57 temp 


The date and time are of the last change to the 
file. The 41 and 78 are the number of characters 
(which should agree with the numbers you got 
from ed). bwk is the owner of the file, that is, 
the person who created it. The -rw-rw-rw- 
tells who has permission to read and write the 
file, in this case everyone. 


† This is not strictly true — if you hang up while editing, 
the data you were working on is saved in a file called 
ed.hup, which you can continue with at your next session. 


Options can be combined: ls -lt gives the 
same thing as ls -l, but sorted into time order. 
You can also name the files you're interested in, 
and ls will list the information about them only. 
More details can be found in ls(1). 
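
The listing orders described above can be tried in a scratch directory. This is a sketch under stated assumptions: it uses mktemp -d and touch -t, which are conveniences of current systems rather than anything this article describes, to make two empty files with known modification times.

```shell
# Scratch directory so the listing contains only our two files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Give the files known modification times: junk older, temp newer.
touch -t 202001010900 junk
touch -t 202001020900 temp

ls        # alphabetical order: junk, then temp
ls -t     # most recently changed first: temp, then junk
ls -lt    # long listing, in the same time order

newest=$(ls -t | head -n 1)
echo "$newest"
```

The owner, size and date columns printed by ls -lt will differ from the sample shown in the article, since they describe your own files.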


The use of optional arguments that begin 
with a minus sign, like -t and -lt, is a com- 
mon convention for UNIX programs. In general, 
if a program accepts such optional arguments, 
they precede any filename arguments. It is also 
vital that you separate the various arguments 
with spaces: ls-l is not the same as ls -l. 


Printing Files 


Now that you've got a file of text, how do 
you print it so people can look at it? There are a 
host of programs that do that, probably more 
than are needed. 


One simple thing is to use the editor, since 
printing is often done just before making 
changes anyway. You can say 


ed junk 
1,$p 


ed will reply with the count of the characters in 
junk and then print all the lines in the file. 
After you learn how to use the editor, you can 
be selective about the parts you print. 


There are times when it’s not feasible to use 
the editor for printing. For example, there is a 
limit on how big a file ed can handle (several 
thousand lines). Secondly, it will only print one 
file at a time, and sometimes you want to print 
several, one after another. So here are a couple 
of alternatives. 


First is cat, the simplest of all the printing 
programs. cat simply prints on the terminal the 
contents of all the files named in a list. Thus 


cat junk 
prints one file, and 
cat junk temp 


prints two. The files are simply concatenated 
(hence the name ‘‘cat’’) onto the terminal. 
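
A short session shows the concatenation. This is a sketch: the file contents are invented for illustration, and the files are created with echo and >, which this article explains later under "Using Files instead of the Terminal".

```shell
scratch=$(mktemp -d)      # mktemp -d is assumed to be available
cd "$scratch" || exit 1
echo 'first file' >junk
echo 'second file' >temp

cat junk          # prints the one line of junk
cat junk temp     # prints junk's line, then temp's line
both=$(cat junk temp)
```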


pr produces formatted printouts of files. As 
with cat, pr prints all the files named in a list. 
The difference is that it produces headings with 
date, time, page number and file name at the top 
of each page, and extra lines to skip over the 
fold in the paper. Thus, 


pr junk temp 


will print junk neatly, then skip to the top of a 
new page and print temp neatly. 


pr can also produce multi-column output: 
pr -3 junk 


prints junk in 3-column format. You can use 
any reasonable number in place of “3” and pr 
will do its best. pr has other capabilities as well; 
see pr(1). 


It should be noted that pr is not a formatting 
program in the sense of shuffling lines around 
and justifying margins. The true formatters are 
nroff and troff, which we will get to in the sec- 
tion on document preparation. 


There are also programs that print files on a 
high-speed printer. Look in your manual under 
opr and lpr. Which to use depends on what 
equipment is attached to your machine. 


Shuffling Files About 


Now that you have some files in the file sys- 
tem and some experience in printing them, you 
can try bigger things. For example, you can 
move a file from one place to another (which 
amounts to giving it a new name), like this: 


mv junk precious 


This means that what used to be “junk” is now 
“precious”. If you do an ls command now, you 
will get 


precious 
temp 


Beware that if you move a file to another one 
that already exists, the already existing contents 
are lost forever. 


If you want to make a copy of a file (that is, 
to have two versions of something), you can use 
the cp command: 


cp precious temp1 


makes a duplicate copy of precious in temp1. 


Finally, when you get tired of creating and 
moving files, there is a command to remove files 
from the file system, called rm. 


rm temp temp1 


will remove both of the files named. 


You will get a warning message if one of the 
named files wasn't there, but otherwise rm, like 
most UNIX commands, does its work silently. 
There is no prompting or chatter, and error mes- 
sages are occasionally curt. This terseness is 
sometimes disconcerting to newcomers, but 
experienced users find it desirable. 
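
The whole sequence of moving, copying and removing can be replayed in a scratch directory. This is a sketch: the file contents are invented, and mktemp -d is assumed to be available on your system.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo 'some text' >junk
echo 'more text' >temp

mv junk precious      # junk is now called precious
cp precious temp1     # temp1 is a second copy of precious
ls                    # precious, temp, temp1
rm temp temp1         # silent on success
ls                    # only precious remains
remaining=$(ls)
```

Note that none of mv, cp or rm prints anything when it succeeds; the two ls commands are the only way to see what happened.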


What’s in a Filename 


So far we have used filenames without ever 
saying what's a legal name, so it’s time for a 
couple of rules. First, filenames are limited to 
14 characters, which is enough to be descriptive. 
Second, although you can use almost any charac- 
ter in a filename, common sense says you should 
stick to ones that are visible, and that you should 
probably avoid characters that might be used 
with other meanings. We have already seen, for 
example, that in the ls command, ls -t means 
to list in time order. So if you had a file whose 
name was -t, you would have a tough time list- 
ing it by name. Besides the minus sign, there 
are other characters which have special meaning. 
To avoid pitfalls, you would do well to use only 
letters, numbers and the period until you're fam- 
iliar with the situation. 
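
If you do end up with a file called -t, one common way out is to name it by a path that does not begin with a minus sign. The sketch below assumes a scratch directory made with mktemp -d; the ./ prefix is a general workaround and is not something this article itself describes.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo 'awkward' >./-t     # create a file whose name is -t

ls -t        # -t is taken as the "time order" option, not as a filename
ls ./-t      # ./-t names the same file, but does not look like an option
listed=$(ls ./-t)
rm ./-t      # the same trick removes it
```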


On to some more positive suggestions. Sup- 
pose you're typing a large document like a book. 
Logically this divides into many small pieces, like 
chapters and perhaps sections. Physically it must 
be divided too, for ed will not handle really big 
files. Thus you should type the document as a 
number of files. You might have a separate file 
for each chapter, called 


chap1 
chap2 
etc.... 


Or, if each chapter were broken into several files, 
you might have 


chap1.1 
chap1.2 
chap1.3 
chap2.1 
chap2.2 


... 


You can now tell at a glance where a particular 
file fits into the whole. 


There are advantages to a systematic naming 
convention which are not obvious to the novice 
UNIX user. What if you wanted to print the 
whole book? You could say 


pr chap1.1 chap1.2 chap1.3 ...... 


but you would get tired pretty fast, and would 
probably even make mistakes. Fortunately, 
there is a shortcut. You can say 


pr chap* 


The * means “anything at all,” so this translates 
into “print all files whose names begin with 
chap”, listed in alphabetical order. 


This shorthand notation is not a property of 
the pr command, by the way. It is system-wide, 
a service of the program that interprets com- 
mands (the “shell,” sh(1)). Using that fact, 
you can see how to list the names of the files in 
the book: 


ls chap* 
produces 


chap1.1 
chap1.2 
chap1.3 


... 


The * is not limited to the last position in a 
filename — it can be anywhere and can occur 
several times. Thus 


rm *junk* *temp* 
removes all files that contain junk or temp as 
any part of their name. As a special case, * by 
itself matches every filename, so 

pr * 
prints all your files (alphabetical order), and 

rm * 
removes all files. (You had better be very sure 
that’s what you wanted to say!) 


The * is not the only pattern-matching 
feature available. Suppose you want to print 
only chapters 1 through 4 and 9. Then you can 
say 


pr chap[12349]* 


The [...] means to match any of the characters 
inside the brackets. A range of consecutive 
letters or digits can be abbreviated, so you can 
also do this with 

pr chap[1-49]* 
Letters can also be used within brackets: [a-z] 
matches any character in the range a through z. 


The ? pattern matches any single character, 
so 


ls ? 


lists all files which have single-character names, 
and 


ls -l chap?.1 


lists information about the first file of each 
chapter (chap1.1, chap2.1, etc.). 


Of these niceties, * is certainly the most use- 
ful, and you should get used to it. The others 
are frills, but worth knowing. 


If you should ever have to turn off the spe- 
cial meaning of *, ?, etc, enclose the entire 
argument in single quotes, as in 


ls '?' 


We'll see some more examples of this shortly. 
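
The patterns are easiest to understand by watching what the shell expands them to. The sketch below makes a few empty files with invented names in a scratch directory (mktemp -d and touch are assumed to be available) and uses echo to show each expansion.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch chap1.1 chap1.2 chap2.1 chap2.2 chap9.1 x

ls chap*          # every file whose name begins with chap
ls chap[129]*     # chapters 1, 2 and 9
ls chap[1-2]*     # chapters 1 and 2, written as a range
ls chap?.1        # the first file of each chapter
ls ?              # single-character names: x
echo 'chap*'      # quoted, so the shell leaves the * alone

all=$(echo chap*)
firsts=$(echo chap?.1)
quoted=$(echo 'chap*')
```

Because the shell does the expansion, echo sees the same list of names that ls or pr would; this makes echo a safe way to check a pattern before handing it to rm.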


What's in a Filename, Continued 


When you first made that file called junk, 
how did the system know that there wasn’t 
another junk somewhere else, especially since 
the person in the next office is also reading this 
tutorial? The answer is that generally each user 
has a private directory, which contains only the 
files that belong to him. When you log in, you 
are “in” your directory. Unless you take special 
action, when you create a new file, it is made in 
the directory that you are currently in; this is 
most often your own directory, and thus the file 
is unrelated to any other file of the same name 
that might exist in someone else's directory. 


The set of all files is organized into a (usu- 
ally big) tree, with your files located several 
branches into the tree. It is possible for you to 
“walk” around this tree, and to find any file in 
the system, by starting at the root of the tree and 
walking along the proper set of branches. Con- 
versely, you can start where you are and walk 
toward the root. 


Let’s try the latter first. The basic tool is 
the command pwd (“print working directory”), 
which prints the name of the directory you are 
currently in. 


Although the details will vary according to 
the system you are on, if you give the command 
pwd, it will print something like 


/usr/your-name 


This says that you are currently in the directory 
your-name, which is in turn in the directory 
/usr, which is in turn in the root directory called 
by convention just /. (Even if it’s not called 
/usr on your system, you will get something 
analogous. Make the corresponding changes and 
read on.) 


If you now type 
ls /usr/your-name 
you should get exactly the same list of file names 
as you get from a plain ls: with no arguments, ls 
lists the contents of the current directory; given 
the name of a directory, it lists the contents of 
that directory. 


Next, try 

ls /usr 
This should print a long series of names, among 
which is your own login name your-name. On 
many systems, usr is a directory that contains 
the directories of all the normal users of the sys- 
tem, like you. 


The next step is to try 
ls / 
You should get a response something like this 
(although again the details may be different): 


bin 
dev 
etc 
lib 
tmp 
usr 


This is a collection of the basic directories of files 
that the system knows about; we are at the root 
of the tree. 

Now try 
cat /usr/your-name/junk 


(if junk is still around in your directory). The 
name 


/usr/your-name/junk 


is called the pathname of the file that you nor- 
mally think of as “junk”. “Pathname” has an 
obvious meaning: it represents the full name of 
the path you have to follow from the root 
through the tree of directories to get to a particu- 
lar file. It is a universal rule in the UNIX system 
that anywhere you can use an ordinary filename, 
you can use a pathname. 
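
The rule can be checked directly: a file's short name and its full pathname reach the same contents. This is a sketch in a scratch directory (made with mktemp -d, assumed available), so the pathname printed will not begin with /usr/your-name.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo 'hello' >junk

here=$(pwd)            # full pathname of the current directory
ls                     # contents of the current directory
ls "$here"             # the same list, naming the directory explicitly
cat junk               # the file by its short name
cat "$here/junk"       # the same file by its pathname
short=$(cat junk)
full=$(cat "$here/junk")
```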


Here is a picture which may make this 
clearer: 


                   (root) 
                  /  |  \ 
                 /   |   \ 
     bin    etc    usr    dev    tmp 
    / | \  / | \  / | \  / | \  / | \ 
                 /  |  \ 
             adam  eve  mary 
                  /   \     \ 
               junk   temp   junk 

Notice that Mary’s junk is unrelated to Eve's. 


This isn’t too exciting if all the files of 
interest are in your own directory, but if you 
work with someone else or on several projects 
concurrently, it becomes handy indeed. For 
example, your friends can print your book by 
saying 


pr /usr/your-name/chap* 


Similarly, you can find out what files your neigh- 
bor has by saying 


ls /usr/neighbor-name 
or make your own copy of one of his files by 


cp /usr/your-neighbor/his-file yourfile 


If your neighbor doesn’t want you poking 
around in his files, or vice versa, privacy can be 
arranged. Each file and directory has read-write- 
execute permissions for the owner, a group, and 
everyone else, which can be set to control access. 
See ls(1) and chmod(1) for details. As a matter 
of observed fact, most users most of the time 
find openness of more benefit than privacy. 


As a final experiment with pathnames, try 
ls /bin /usr/bin 


Do some of the names look familiar? When you 
run a program, by typing its name after the 
prompt character, the system simply looks for a 
file of that name. It normally looks first in your 
directory (where it typically doesn’t find it), then 
in /bin and finally in /usr/bin. There is nothing 
magic about commands like cat or ls, except that 
they have been collected into a couple of places 
to be easy to find and administer. 
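
On current systems you can ask the shell where it found a command. The sketch below assumes a POSIX shell, where the list of directories searched is held in the variable PATH and command -v reports the file that would be run; the exact directories printed depend on your system and may differ from the fixed order described above.

```shell
echo "$PATH"          # colon-separated list of directories searched
command -v cat        # pathname of the file that runs when you type cat
command -v ls

cat_path=$(command -v cat)
ls -l "$cat_path"     # cat is an ordinary file with execute permission
```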


What if you work regularly with someone 
else on common information in his directory? 
You could just log in as your friend each time 
you want to, but you can also say “‘I want to 
work on his files instead of my own’’. This is 
done by changing the directory that you are 
currently in: 


cd /usr/your-friend 


(On some systems, cd is spelled chdir.) Now 
when you use a filename in something like cat or 
pr, it refers to the file in your friend's directory. 
Changing directories doesn’t affect any permis- 
sions associated with a file — if you couldn’t 
access a file from your own directory, changing 
to another directory won't alter that fact. Of 
course, if you forget what directory you're in, 
type 


pwd 
to find out. 


It is usually convenient to arrange your own 
files so that all the files related to one thing are 
in a directory separate from other projects. For 
example, when you write your book, you might 
want to keep all the text in a directory called 
book. So make one with 


mkdir book 
then go to it with 
cd book 


then start typing chapters. The book is now 
found in (presumably) 


/usr/your-name/book 
To remove the directory book, type 


rm book/* 
rmdir book 


The first command removes all files from the 
directory; the second removes the empty direc- 
tory. 
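
The life cycle of a project directory, from mkdir to rmdir, can be rehearsed safely in a scratch area. This is a sketch: mktemp -d and touch are assumed to be available, and the chapter files are empty placeholders.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

mkdir book                 # make the directory
cd book || exit 1          # go to it
touch chap1 chap2          # stand-ins for the chapters you would type
pwd                        # the pathname now ends in /book
cd "$scratch" || exit 1    # step back out before removing it

rm book/*                  # remove the files first ...
rmdir book                 # ... then the empty directory
```

rmdir refuses to remove a directory that still contains files, which is why the rm comes first.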


You can go up one level in the tree of files 
by saying 


cd .. 


“..” is the name of the parent of whatever direc- 
tory you are currently in. For completeness, “.” 
is an alternate name for the directory you are in. 
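
A quick experiment confirms both names. This is a sketch in a scratch directory created with mktemp -d (an assumption about your system); the variable names are only for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir book
cd book || exit 1
inside=$(pwd)       # pathname of book

cd ..               # up one level, to the parent of book
parent=$(pwd)

cd .                # "." is the current directory, so nothing changes
same=$(pwd)

echo "$inside"
echo "$parent"
```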


Using Files instead of the Terminal 


Most of the commands we have seen so far 
produce output on the terminal; some, like the 
editor, also take their input from the terminal. It 
is universal in UNIX systems that the terminal 
can be replaced by a file for either or both of 
input and output. As one example, 


ls


makes a list of files on your terminal. But if you 
say 


ls >filelist


a list of your files will be placed in the file filelist
(which will be created if it doesn’t already exist,
or overwritten if it does). The symbol > means
"put the output on the following file, rather than
on the terminal." Nothing is produced on the
terminal. As another example, you could com- 
bine several files into one by capturing the out- 
put of cat in a file: 


cat f1 f2 f3 >temp


The symbol >> operates very much like >
does, except that it means "add to the end of."
That is,


cat f1 f2 f3 >>temp


means to concatenate f1, f2 and f3 to the end of
whatever is already in temp, instead of overwriting
the existing contents. As with >, if temp
doesn’t exist, it will be created for you.
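The difference between > and >> is easiest to see by trying both on the same file. This is a small sketch, assuming a POSIX shell; the scratch directory and the one-word file contents are made up for the illustration.

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"
echo one >f1
echo two >f2
echo three >f3

cat f1 f2 >temp       # temp is created and holds: one, two
cat f3 >temp          # > overwrites: temp now holds only: three
cat f1 f2 >>temp      # >> adds to the end: three, one, two
cat temp              # show the final contents on the terminal
```

The second command silently discards what the first one wrote, so > deserves some care when the target file already holds something you want.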


In a similar way, the symbol < means to 
take the input for a program from the following 
file, instead of from the terminal. Thus, you 
could make up a script of commonly used editing 
commands and put them into a file called script. 
Then you can run the script on a file by saying 


ed file <script 


As another example, you can use ed to prepare a 
letter in file let, then send it to several people
with 


mail adam eve mary joe < let 


Pipes 


One of the novel contributions of the UNIX 
system is the idea of a pipe. A pipe is simply a
way to connect the output of one program to the
input of another program, so the two run as a
sequence of processes, called a pipeline.


For example,


pr f g h


will print the files f, g, and h, beginning each on 
a new page. Suppose you want them run 
together instead. You could say 


cat f g h >temp
pr <temp 
rm temp 


but this is more work than necessary. Clearly 
what we want is to take the output of cat and 
connect it to the input of pr. So let us use a 
pipe: 

cat f g h | pr


The vertical bar | means to take the output from 
cat, which would normally have gone to the ter- 
minal, and put it into pr to be neatly formatted. 


There are many other examples of pipes. 
For example,


ls | pr -3


prints a list of your files in three columns. The 
program wc counts the number of lines, words
and characters in its input, and as we saw earlier,
who prints a list of currently logged-on people,
one per line. Thus


who | wc


tells how many people are logged on. And of 
course 


ls | wc
counts your files. 


Any program that reads from the terminal 
can read from a pipe instead; any program that 
writes on the terminal can drive a pipe. You can 
have as many elements in a pipeline as you wish.


Many UNIX programs are written so that 
they will take their input from one or more files 
if file arguments are given; if no arguments are 
given they will read from the terminal, and thus 
can be used in pipelines. pr is one example: 


pr -3 a b c


prints files a, b and c in order in three columns.
But in


cat a b c | pr -3


pr prints the information coming down the pipeline,
still in three columns.
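To see that a pipe really does the same job as a temporary file, you can run both versions and compare. The following is a sketch, assuming a POSIX shell; the file contents are invented, and wc -l is used instead of pr so that the result is a single number.

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"
printf 'alpha\n' >f
printf 'beta\ngamma\n' >g
printf 'delta\n' >h

# The long way: a temporary file carries the data between programs.
cat f g h >temp
long=$(wc -l <temp)
rm temp

# The pipe: the output of cat becomes the input of wc directly.
short=$(cat f g h | wc -l)

# A pipeline may have more than two elements.
sorted=$(cat f g h | sort | wc -l)

echo "temporary file: $long lines, pipe: $short lines"
```

Both routes count the same four lines; the pipe simply spares you from creating and removing temp.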




The Shell


We have already mentioned once or twice
the mysterious "shell," which is in fact sh(1).
The shell is the program that interprets what you
type as commands and arguments. It also looks
after translating *, etc., into lists of filenames,
and <, >, and | into changes of input and output
streams.


The shell has other capabilities too. For 
example, you can run two programs with one 
command line by separating the commands with 
a semicolon: the shell recognizes the semicolon 
and breaks the line into two commands. Thus 


date; who 


does both commands before returning with a
prompt character.


You can also have more than one program 
running simultaneously if you wish. For example,
if you are doing something time-consuming, like 
the editor script of an earlier section, and you 
don’t want to wait around for the results before 
starting something else, you can say 


ed file <script & 


The ampersand at the end of a command line
says "start this command running, then take
further commands from the terminal immediately,"
that is, don’t wait for it to complete.
Thus the script will begin, but you can do something
else at the same time. Of course, to keep
the output from interfering with what you're
doing on the terminal, it would be better to say


ed file <script >script.out &


which saves the output lines in a file called
script.out.


When you initiate a command with &, the
system replies with a number called the process
number, which identifies the command in case
you later want to stop it. If you do, you can say


kill process-number


If you forget the process number, the command
ps will tell you about everything you have running.
(If you are desperate, kill 0 will kill all
your processes.) And if you're curious about
other people, ps a will tell you about all programs
that are currently running.
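Here is a sketch of starting a background command and stopping it again, assuming a POSIX shell. At the terminal you would read the process number off the screen; inside a command file the shell variable $! holds it, which is an addition of this example and not something the text above covers. sleep 30 stands in for any slow command.

```shell
sleep 30 &            # start a slow command in the background
pid=$!                # $! is the process number of that command
                      # (an assumption here; at a terminal the shell prints it)
kill "$pid"           # stop it by process number
status=0
wait "$pid" 2>/dev/null || status=$?   # collect it; a killed command reports a status above 128
echo "process $pid ended with status $status"
```

The shell does not wait 30 seconds at any point: the & returns control at once, and kill ends the command long before it would have finished by itself.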


You can say


(command-1; command-2; command-3) &


to start three commands in the background, or
you can start a background pipeline with 


command-1 | command-2 & 


Just as you can tell the editor or some similar
program to take its input from a file instead
of from the terminal, you can tell the shell to 
read a file to get commands. (Why not? The 
shell, after all, is just a program, albeit a clever 
one.) For instance, suppose you want to set tabs 
on your terminal, and find out the date and
who's on the system every time you log in.
Then you can put the three necessary commands 
(tabs, date, who) into a file, let's call it startup, 
and then run it with 


sh startup 


This says to run the shell with the file startup as 
input. The effect is as if you had typed the con- 
tents of startup on the terminal. 


If this is to be a regular thing, you can elim- 
inate the need to type sh: simply type, once only, 
the command 


chmod +x startup 
and thereafter you need only say 
startup 


to run the sequence of commands. The 
chmod(1) command marks the file executable; 
the shell recognizes this and runs it as a 
sequence of commands. 
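The two ways of running a command file can be sketched as follows, assuming a POSIX shell. The contents of startup here (an echo and date) are a stand-in chosen so the example runs anywhere; the text's version would hold tabs, date and who. The ./ prefix is also an assumption: it is needed on systems where the current directory is not searched for commands.

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"

# Put two commands into a file called startup.
printf 'echo "Logged in at:"\ndate\n' >startup

sh startup >out1      # run the shell with startup as its input
chmod +x startup      # once only: mark the file executable
./startup >out2       # from now on the name alone is enough
cat out2
```

After the chmod, startup behaves like any other command, so it can be used in pipelines or named in .profile.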


If you want startup to run automatically 
every time you log in, create a file in your login 
directory called .profile, and place in it the line 
startup. When the shell first gains control when 
you log in, it looks for the .profile file and does 
whatever commands it finds in it. We'll get back 
to the shell in the section on programming. 


III. DOCUMENT PREPARATION


UNIX systems are used extensively for docu- 
ment preparation. There are two major format- 
ting programs, that is, programs that produce a 
text with justified right margins, automatic page 
numbering and titling, automatic hyphenation, 
and the like. nroff is designed to produce output
on terminals and line-printers. troff (pronounced
"tee-roff") instead drives a phototypesetter,
which produces very high quality output
on photographic paper. This paper was formatted
with troff.


Formatting Packages 


The basic idea of nroff and troff is that the
text to be formatted contains within it "formatting
commands" that indicate in detail how the
formatted text is to look. For example, there
might be commands that specify how long lines 
are, whether to use single or double spacing, and 
what running titles to use on each page. 


Because nroff and troff are relatively hard to 
learn to use effectively, several "packages" of
canned formatting requests are available to let 
you specify paragraphs, running titles, footnotes, 
multi-column output, and so on, with little effort 
and without having to learn nroff and troff. 
These packages take a modest effort to learn, but 
the rewards for using them are so great that it is 
time well spent. 


In this section, we will provide a hasty look 
at the "manuscript" package known as ms.
Formatting requests typically consist of a period 
and two upper-case letters, such as .TL, which is 
used to introduce a title, or .PP to begin a new 
paragraph. 


A document is typed so it looks something 
like this: 


.TL
title of document
.AU
author name
.SH
section heading
.PP
paragraph ...
.PP
another paragraph ...
.SH
another section heading
.PP
etc.


The lines that begin with a period are the formatting
requests. For example, .PP calls for
starting a new paragraph. The precise meaning
of .PP depends on what output device is being
used (typesetter or terminal, for instance), and
on what publication the document will appear in.
For example, ms normally assumes that a
paragraph is preceded by a space (one line in
nroff, 1/2 line in troff), and the first word is
indented. These rules can be changed if you
like, but they are changed by changing the
interpretation of .PP, not by re-typing the document.


To actually produce a document in standard 
format using ms, use the command 


troff -ms files ...


for the typesetter, and


nroff -ms files ...


for a terminal. The -ms argument tells troff
and nroff to use the manuscript package of formatting
requests.


There are several similar packages: check 
with a local expert to determine which ones are 
in common use on your machine. 


Supporting Tools 


In addition to the basic formatters, there is a
host of supporting programs that help with document
preparation. The list in the next few paragraphs
is far from complete, so browse through
the manual and check with people around you 
for other possibilities. 


eqn and neqn let you integrate mathematics 
into the text of a document, in an easy-to-learn 
language that closely resembles the way you 
would speak it aloud. For example, the eqn 
input 


sum from i=0 to n x sub i ~=~ pi over 2


produces the output


$\sum_{i=0}^{n} x_i = \frac{\pi}{2}$


The program tbl provides an analogous service
for preparing tabular material; it does all the
computations necessary to align complicated
columns with elements of varying widths.


refer prepares bibliographic citations from a
data base, in whatever style is defined by the formatting
package. It looks after all the details of
numbering references in sequence, filling in page
and volume numbers, getting the author’s initials
and the journal name right, and so on.


spell and typo detect possible spelling mistakes
in a document. spell works by comparing
the words in your document to a dictionary,
printing those that are not in the dictionary. It
knows enough about English spelling to detect
plurals and the like, so it does a very good job.
typo looks for words which are "unusual", and
prints those. Spelling mistakes tend to be more
unusual, and thus show up early when the most
unusual words are printed first.


grep looks through a set of files for lines 
that contain a particular text pattern (rather like 
the editor's context search does, but on a bunch 
of files). For example,


grep 'ing$' chap*


will find all lines that end with the letters ing in
the files chap*. (It is almost always a good practice
to put single quotes around the pattern
you're searching for, in case it contains characters
like * or $ that have a special meaning to the
shell.) grep is often useful for finding out in
which of a set of files the misspelled words
detected by spell are actually located.
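A quick way to convince yourself of what the pattern matches is to try it on a couple of small files. This is a sketch, assuming a POSIX shell; the two chap files and their contents are invented for the purpose.

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"
printf 'It was raining\nThe rain stopped.\n' >chap1
printf 'ingots of gold\nStill waiting\n' >chap2

# The single quotes hand ing$ to grep untouched; grep then reads
# the $ as "end of line", so only lines ending in ing are printed.
hits=$(grep 'ing$' chap*)
echo "$hits"
```

Because more than one file is searched, grep puts the file name in front of each matching line, which is what makes it handy for tracking down where a misspelled word lives.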


diff prints a list of the differences between 
two files, so you can compare two versions of 
something automatically (which certainly beats 
proofreading by hand).


wc counts the words, lines and characters in
a set of files. tr translates characters into other
characters; for example, it will convert upper to
lower case and vice versa. This translates upper
into lower:


tr A-Z a-z <input >output
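Since tr reads only its standard input and writes only its standard output, it is a natural place to use < and > together. A small sketch, assuming a POSIX shell and a made-up one-line input file:

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"
echo 'Hello, UNIX World' >input

tr A-Z a-z <input >output   # upper case into lower case
tr a-z A-Z <input >shout    # and vice versa (shout is a hypothetical name)
cat output shout
```

Characters outside the named ranges, such as the comma and the spaces, pass through unchanged.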


sort sorts files in a variety of ways; cref
makes cross-references; ptx makes a permuted
index (keyword-in-context listing). sed provides
many of the editing facilities of ed, but can apply
them to arbitrarily long inputs. awk provides the
ability to do both pattern matching and numeric
computations, and to conveniently process fields
within lines. These programs are for more
advanced users, and they are not limited to
document preparation. Put them on your list of
things to learn about.


Most of these programs are either independently
documented (like eqn and tbl), or are
sufficiently simple that the description in the 
UNIX Programmer's Manual is adequate explana- 
tion. 


Hints for Preparing Documents


Most documents go through several versions
(always more than you expected) before they are 
finally finished. Accordingly, you should do 
whatever possible to make the job of changing 
them easy. 


First, when you do the purely mechanical 
operations of typing, type so that subsequent 
editing will be easy. Start each sentence on a 
new line. Make lines short, and break lines at
natural places, such as after commas and semicolons,
rather than randomly. Since most people
change documents by rewriting phrases and
adding, deleting and rearranging sentences, these 
precautions simplify any editing you have to do 
later. 


Keep the individual files of a document
down to modest size, perhaps ten to fifteen
thousand characters. Larger files edit more
slowly, and of course if you make a dumb mistake
it’s better to have clobbered a small file
than a big one. Split into files at natural boundaries
in the document, for the same reasons
that you start each sentence on a new line.


The second aspect of making change easy is
to not commit yourself to formatting details too
early. One of the advantages of formatting packages
like ms is that they permit you to delay
decisions to the last possible moment. Indeed,
until a document is printed, it is not even
decided whether it will be typeset or put on a line
printer.


As a rule of thumb, for all but the most
trivial jobs, you should type a document in terms 
of a set of requests like .PP, and then define 
them appropriately, either by using one of the 
canned packages (the better way) or by defining 
your own nroff and troff commands. As long as 
you have entered the text in some systematic 
way, it can always be cleaned up and re- 
formatted by a judicious combination of editing 
commands and request definitions. 


IV. PROGRAMMING 


There will be no attempt made to teach any 
of the programming languages available but a 
few words of advice are in order. One of the 
reasons why the UNIX system is a productive 
programming environment is that there is 
already a rich set of tools available, and facilities 
like pipes, I/O redirection, and the capabilities of 
the shell often make it possible to do a job by 
pasting together programs that already exist 
instead of writing from scratch. 


The Shell 


The pipe mechanism lets you fabricate quite 
complicated operations out of spare parts that 
already exist. For example, the first draft of the 
spell program was (roughly) 


cat ...       collect the files
| tr ...      put each word on a new line
| tr ...      delete punctuation, etc.
| sort        into dictionary order
| uniq        discard duplicates
| comm        print words in text
              but not in dictionary


More pieces have been added subsequently, but 
this goes a long way for such a small effort. 
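To make the idea concrete, here is one runnable guess at such a pipeline, assuming a POSIX shell. It is not the actual spell program: the arguments to tr and comm are elided in the outline above, so the ones used here are assumptions, and chap1 and the six-word dict file are invented stand-ins for a real document and dictionary.

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"
printf 'The quick brown foxx.\nThe lazy dog, the end.\n' >chap1
# A tiny stand-in dictionary: one word per line, already sorted.
printf 'brown\ndog\nend\nlazy\nquick\nthe\n' >dict

misspelled=$(
  cat chap1 |            # collect the files
  tr A-Z a-z |           # fold case so that "The" matches "the"
  tr -cs a-z '\n' |      # one word per line; punctuation and blanks dropped
  sort |                 # into dictionary order
  uniq |                 # discard duplicates
  comm -23 - dict        # keep words in the text but not in the dictionary
)
echo "$misspelled"
```

comm expects both of its inputs to be sorted the same way, which is why the sort step has to come before it and why dict is kept in order.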


The editor can be made to do things that 
would normally require special programs on 
other systems. For example, to list the first and 
last lines of each of a set of files, such as a book, 
you could laboriously type 


ed
e chap1.1
1p
$p
e chap1.2
1p
$p
etc.


But you can do the job much more easily. One 
way is to type 


ls chap* >temp 


to get the list of filenames into a file. Then edit
this file to make the necessary series of editing
commands (using the global commands of ed),
and write it into script. Now the command


ed <script 


will produce the same output as the laborious 
hand typing. Alternately (and more easily), you 
can use the fact that the shell will perform loops, 
repeating a set of commands over and over again 
for a set of arguments: 


for i in chap*
do
    ed $i <script
done


This sets the shell variable i to each file name in
turn, then does the command. You can type this
command at the terminal, or put it in a file for
later execution.
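The loop can be tried without preparing an editor script first. In this sketch, which assumes a POSIX shell, sed stands in for ed: sed -n -e 1p -e '$p' prints the first and last lines of a file, the same job as the ed commands 1p and $p. The substitution and the two sample files are assumptions of the example.

```shell
dir=$(mktemp -d)      # scratch directory (an assumption, for safety)
cd "$dir"
printf 'first of 1\nmiddle\nlast of 1\n' >chap1
printf 'first of 2\nmiddle\nlast of 2\n' >chap2

for i in chap*
do
    sed -n -e 1p -e '$p' "$i"   # first and last line of the file named by $i
done >summary                    # the whole loop can be redirected at once
cat summary
```

Putting the > after done sends the output of every pass through the loop to the same file, which is usually what you want.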


Programming the Shell 


An option often overlooked by newcomers is 
that the shell is itself a programming language, 
with variables, control flow (if-else, while, for, 
case), subroutines, and interrupt handling. Since 
there are many building-block programs, you can 
sometimes avoid writing a new program merely 
by piecing together some of the building blocks 
with shell command files. 


We will not go into any details here; examples
and rules can be found in An Introduction to
the UNIX Shell, by S. R. Bourne.


Programming in C 


If you are undertaking anything substantial, 
C is the only reasonable choice of programming 
language: everything in the UNIX system is tuned 
to it. The system itself is written in C, as are 
most of the programs that run on it. It is also an
easy language to use once you get started. C is
introduced and fully described in The C Program- 
ming Language by B. W. Kernighan and D. M. 
Ritchie (Prentice-Hall, 1978). Several sections 
of the manual describe the system interfaces, 
that is, how you do I/O and similar functions. 
Read UNIX Programming for more complicated 
things. 


Most input and output in C is best handled 
with the standard I/O library, which provides a
set of I/O functions that exist in compatible 
form on most machines that have C compilers. 
In general, it’s wisest to confine the system 
interactions in a program to the facilities pro- 
vided by this library. 


C programs that don’t depend too much on
special features of UNIX (such as pipes) can be
moved to other computers that have C compilers.
The list of such machines grows daily; in
addition to the original PDP-11, it currently
includes at least Honeywell 6000, IBM 370,
Interdata 8/32, Data General Nova and Eclipse,
HP 2100, Harris /7, VAX 11/780, SEL 86, and
Zilog Z80. Calls to the standard I/O library will
work on all of these machines.


There are a number of supporting programs
that go with C. lint checks C programs for 
potential portability problems, and detects errors 
such as mismatched argument types and unini- 
tialized variables. 

For larger programs (anything whose source 
is on more than one file) make allows you to 
specify the dependencies among the source files 
and the processing steps needed to make a new
version; it then checks the times that the pieces
were last changed and does the minimal amount
of recompiling to create a consistent updated version.


The debugger adb is useful for digging 
through the dead bodies of C programs, but is 
rather hard to learn to use effectively. The most 
effective debugging tool is still careful thought, 
coupled with judiciously placed print statements.


The C compiler provides a limited instrumentation
service, so you can find out where
programs spend their time and what parts are
worth optimizing. Compile the routines with the
-p option; after the test run, use prof to print
an execution profile. The command time will
give you the gross run-time statistics of a program,
but they are not super accurate or reproducible.


Other Languages 


If you have to use Fortran, there are two 
possibilities. You might consider Ratfor, which 
gives you the decent control structures and free- 
form input that characterize C, yet lets you write 
code that is still portable to other environments. 
Bear in mind that UNIX Fortran tends to produce
large and relatively slow-running programs.
Furthermore, supporting software like adb, prof,
etc., are all virtually useless with Fortran programs.
There may also be a Fortran 77 compiler
on your system. If so, this is a viable alternative
to Ratfor, and has the non-trivial advantage that
it is compatible with C and related programs.
(The Ratfor processor and C tools can be used
with Fortran 77 too.)


If your application requires you to translate a
language into a set of actions or another
language, you are in effect building a compiler,
though probably a small one. In that case, you
should be using the yacc compiler-compiler,
which helps you develop a compiler quickly. The
lex lexical analyzer generator does the same job
for the simpler languages that can be expressed
as regular expressions. It can be used by itself,
or as a front end to recognize inputs for a
yacc-based program. Both yacc and lex require
some sophistication to use, but the initial effort
of learning them can be repaid many times over
in programs that are easy to change later on.


Most UNIX systems also make available
other languages, such as Algol 68, APL, Basic,
Lisp, Pascal, and Snobol. Whether these are
useful depends largely on the local environment:
if someone cares about the language and has
worked on it, it may be in good shape. If not,
the odds are strong that it will be more trouble
than it’s worth.


V. UNIX READING LIST


General: 


K. L. Thompson and D. M. Ritchie, The UNIX
Programmer's Manual, Bell Laboratories, 1978.
Lists commands, system routines and interfaces,
file formats, and some of the maintenance procedures.
You can't live without this, although
you will probably only need to read section 1.


Documents for Use with the UNIX Time-sharing 
System. Volume 2 of the Programmer’s Manual. 
This contains more extensive descriptions of 
major commands, and tutorials and reference 
manuals. All of the papers listed below are in it, 
as are descriptions of most of the programs men- 
tioned above. 


D. M. Ritchie and K. L. Thompson, "The UNIX
Time-sharing System," CACM, July 1974. An
overview of the system, for people interested in 
operating systems. Worth reading by anyone 
who programs. Contains a remarkable number 
of one-sentence observations on how to do 
things right. 

The Bell System Technical Journal (BSTJ) Special
Issue on UNIX, July/August, 1978, contains
many papers describing recent developments,
and some retrospective material.


The 2nd International Conference on Software 
Engineering (October, 1976) contains several
papers describing the use of the Programmer's 
Workbench (PWB) version of UNIX. 


Document Preparation: 


B. W. Kernighan, "A Tutorial Introduction to
the UNIX Text Editor" and "Advanced Editing
on UNIX," Bell Laboratories, 1978. Beginners
need the introduction; the advanced material will 
help you get the most out of the editor. 


M. E. Lesk, "Typing Documents on UNIX," Bell
Laboratories, 1978. Describes the -ms macro
package, which isolates the novice from the
vagaries of nroff and troff, and takes care of
most formatting situations. If this specific package
isn’t available on your system, something
similar probably is. The most likely alternative is
the PWB/UNIX macro package -mm; see your
local guru if you use PWB/UNIX.


B. W. Kernighan and L. L. Cherry, "A System
for Typesetting Mathematics," Bell Laboratories
Computing Science Tech. Rep. 17.


M. E. Lesk, "Tbl - A Program to Format
Tables," Bell Laboratories CSTR 49, 1976.


J. F. Ossanna, Jr., "NROFF/TROFF User’s
Manual," Bell Laboratories CSTR 54, 1976.
troff is the basic formatter used by -ms, eqn
and tbl. The reference manual is indispensable
if you are going to write or maintain these or
similar programs. But start with:


B. W. Kernighan, "A TROFF Tutorial," Bell
Laboratories, 1976. An attempt to unravel the 
intricacies of troff. 


Programming: 


B. W. Kernighan and D. M. Ritchie, The C Pro- 
gramming Language, Prentice-Hall, 1978. Con- 
tains a tutorial introduction, complete discussions 
of all language features, and the reference 
manual. 


B. W. Kernighan and D. M. Ritchie, "UNIX Programming,"
Bell Laboratories, 1978. Describes
how to interface with the system from C programs:
I/O calls, signals, processes.


S. R. Bourne, "An Introduction to the UNIX
Shell," Bell Laboratories, 1978. An introduction
and reference manual for the Version 7 shell. 
Mandatory reading if you intend to make 
effective use of the programming power of this 
shell. 


S. C. Johnson, "Yacc - Yet Another Compiler-Compiler,"
Bell Laboratories CSTR 32, 1978.


M. E. Lesk, "Lex - A Lexical Analyzer Generator,"
Bell Laboratories CSTR 39, 1975.


S. C. Johnson, "Lint, a C Program Checker,"
Bell Laboratories CSTR 65, 1977.


S. I. Feldman, "MAKE - A Program for Maintaining
Computer Programs," Bell Laboratories
CSTR 57, 1977.


J. F. Maranzano and S. R. Bourne, "A Tutorial
Introduction to ADB," Bell Laboratories CSTR
62, 1977. An introduction to a powerful but
complex debugging tool. 

S. I. Feldman and P. J. Weinberger, "A Portable
Fortran 77 Compiler," Bell Laboratories, 1978.
A full Fortran 77 for UNIX systems.


MAIL REFERENCE MANUAL


Kurt Shoens 
Revised by 
Craig Leres 


Version 2.18 


1. Introduction 


Mail provides a simple and friendly environment for sending and receiving mail. It 
divides incoming mail into its constituent messages and allows the user to deal with them in 
any order. In addition, it provides a set of ed-like commands for manipulating messages and 
sending mail. Mail offers the user simple editing capabilities to ease the composition of out- 
going messages, as well as providing the ability to define and send to names which address 
groups of users. Finally, Mail is able to send and receive messages across such networks as 
the ARPANET, UUCP, and Berkeley network. 


This document describes how to use the Mail program to send and receive messages. 
The reader is not assumed to be familiar with other message handling systems, but should be 
familiar with the UNIX¹ shell, the text editor, and some of the common UNIX commands. “The
UNIX Programmer’s Manual,” “An Introduction to Csh,” and “Text Editing with Ex and Vi” 
can be consulted for more information on these topics. 


Here is how messages are handled: the mail system accepts incoming messages for you 
from other people and collects them in a file, called your system mailbox. When you login, 
the system notifies you if there are any messages waiting in your system mailbox. If you are a 
csh user, you will be notified when new mail arrives if you inform the shell of the location of 
your mailbox. On version 7 systems, your system mailbox is located in the directory 
/usr/spool/mail in a file with your login name. If your login name is “sam,” then you can 
make csh notify you of new mail by including the following line in your .cshrc file:


set mail=/usr/spool/mail/sam 


When you read your mail using Mail, it reads your system mailbox and separates that file into 
the individual messages that have been sent to you. You can then read, reply to, delete, or 
save these messages. Each message is marked with its author and the date they sent it. 


1 UNIX is a trademark of Bell Laboratories.


2. Common usage


The Mail command has two distinct usages, according to whether one wants to send or 
receive mail. Sending mail is simple: to send a message to a user whose login name is, say, 
“root,” use the shell command: 


% Mail root 


then type your message. When you reach the end of the message, type an EOT (control-d) at
the beginning of a line, which will cause Mail to echo “EOT” and return you to the Shell. 
When the user you sent mail to next logs in, he will receive the message: 


You have mail. 
to alert him to the existence of your message. 


If, while you are composing the message you decide that you do not wish to send it after 
all, you can abort the letter with a RUBOUT. Typing a single RUBOUT causes Mail to print 


(Interrupt -- one more to kill letter) 


Typing a second RUBOUT causes Mail to save your partial letter on the file “dead.letter” in 
your home directory and abort the letter. Once you have sent mail to someone, there is no 
way to undo the act, so be careful. 


The message your recipient reads will consist of the message you typed, preceded by a 
line telling who sent the message (your login name) and the date and time it was sent. 


If you want to send the same message to several other people, you can list their login 
names on the command line. Thus, 


% Mail sam bob john 

Tuition fees are due next Friday. Don’t forget!!
<Control-d>

EOT 

% 


will send the reminder to sam, bob, and john. 
If, when you log in, you see the message, 
You have mail. 
you can read the mail by typing simply: 
% Mail 


Mail will respond by typing its version number and date and then listing the messages you 
have waiting. Then it will type a prompt and await your command. The messages are 
assigned numbers starting with 1 — you refer to the messages with these numbers. Mail keeps 
track of which messages are new (have been sent since you last read your mail) and read (have
been read by you). New messages have an N next to them in the header listing and old, but 
unread messages have a U next to them. Mail keeps track of new/old and read/unread mes- 
sages by putting a header field called “Status” into your messages. 


To look at a specific message, use the type command, which may be abbreviated to sim- 
ply t. For example, if you had the following messages: 


N 1 root Wed Sep 21 09:21 "Tuition fees"
N 2 sam  Tue Sep 20 22:55


you could examine the first message by giving the command: 
type 1 
which might cause Mail to respond with, for example: 


Message 1:
From root Wed Sep 21 09:21:45 1978
Subject: Tuition fees 
Status: R 


Tuition fees are due next Wednesday. Don’t forget! 


Many Mail commands that operate on messages take a message number as an argument like 
the type command. For these commands, there is a notion of a current message. When you 
enter the Mail program, the current message is initially the first one. Thus, you can often 
omit the message number and use, for example, 


t 


to type the current message. As a further shorthand, you can type a message by simply giving 
its message number. Hence, 


1 
would type the first message. 


Frequently, it is useful to read the messages in your mailbox in order, one after another. 
You can read the next message in Mail by simply typing a newline. As a special case, you can 
type a newline as your first command to Mail to type the first message. 


If, after typing a message, you wish to immediately send a reply, you can do so with the 
reply command. Reply, like type, takes a message number as an argument. Mail then 
begins a message addressed to the user who sent you the message. You may then type in your 
letter in reply, followed by a <control-d> at the beginning of a line, as before. Mail will type 
EOT, then type the ampersand prompt to indicate its readiness to accept another command. 
In our example, if, after typing the first message, you wished to reply to it, you might give the 
command: 


reply 
Mail responds by typing: 


To: root 
Subject: Re: Tuition fees 


and waiting for you to enter your letter. You are now in the message collection mode 
described at the beginning of this section and Mail will gather up your message up to a 
control-d. Note that it copies the subject header from the original message. This is useful in
that correspondence about a particular matter will tend to retain the same subject heading, 
making it easy to recognize. If there are other header fields in the message, the information 
found will also be used. For example, if the letter had a “To:” header listing several reci- 
pients, Mail would arrange to send your replay to the same people as well. Similarly, if the 
original message contained a “Cc:” (carbon copies to) field, Mail would send your reply to 
those users, too. Mail is careful, though, not too send the message to you, even if you appear 
in the “To:” or “Ce:” field, unless you ask to be included explicitly. See section 4 for more 
details. 


After typing in your letter, the dialog with Mail might look like the following: 


reply 
To: root 
Subject: Re: Tuition fees


Thanks for the reminder 
EOT 
& 


The reply command is especially useful for sustaining extended conversations over the
message system, with other “listening” users receiving copies of the conversation. The reply 
command can be abbreviated to r. 


Sometimes you will receive a message that has been sent to several people and wish to 
reply only to the person who sent it. Reply with a capital R replies to a message, but sends a 
copy to the sender only. 


If you wish, while reading your mail, to send a message to someone, but not as a reply to 
one of your messages, you can send the message directly with the mail command, which takes 
as arguments the names of the recipients you wish to send to. For example, to send a message 
to “frank,” you would do: 


mail frank 

This is to confirm our meeting next Friday at 4. 
EOT 

& 


The mail command can be abbreviated to m. 


Normally, each message you receive is saved in the file mbox in your login directory at 
the time you leave Mail. Often, however, you will not want to save a particular message you 
have received because it is only of passing interest. To avoid saving a message in mbox you 
can delete it using the delete command. In our example, 


delete 1 


will prevent Mail from saving message 1 (from root) in mbox. In addition to not saving 
deleted messages, Mail will not let you type them, either. The effect is to make the message 
disappear altogether, along with its number. The delete command can be abbreviated to
simply d.

Many features of Mail can be tailored to your liking with the set command. The set 
command has two forms, depending on whether you are setting a binary option or a valued 
option. Binary options are either on or off. For example, the “ask” option informs Mail that 
each time you send a message, you want it to prompt you for a subject header, to be included 
in the message. To set the “ask” option, you would type 


set ask 


Another useful Mail option is “hold.” Unless told otherwise, Mail moves the messages 
from your system mailbox to the file mbox in your home directory when you leave Mail. If 
you want Mail to keep your letters in the system mailbox instead, you can set the “hold” 
option. 

Valued options are values which Mail uses to adapt to your tastes. For example, the 
“SHELL” option tells Mail which shell you like to use, and is specified by 

set SHELL=/bin/csh 
for example. Note that no spaces are allowed in “SHELL=/bin/csh.” A complete list of the 
Mail options appears in section 5. 


Another important valued option is “crt.” If you use a fast video terminal, you will find
that when you print long messages, they fly by too quickly for you to read them. With the
“crt” option, you can make Mail print any message larger than a given number of lines by
sending it through the paging program more. For example, most CRT users should do:


set crt=24 


to paginate messages that will not fit on their screens. More prints a screenful of information, 
then types --MORE--. Type a space to see the next screenful. 


Another adaptation to user needs that Mail provides is that of aliases. An alias is simply
a name which stands for one or more real user names. Mail sent to an alias is really sent
to the list of real users associated with it. For example, an alias can be defined for the 
members of a project, so that you can send mail to the whole project by sending mail to just a 
single name. The alias command in Mail defines an alias. Suppose that the users in a pro- 
ject are named Sam, Sally, Steve, and Susan. To define an alias called “project” for them, you 
would use the Mail command: 


alias project sam sally steve susan 


The alias command can also be used to provide a convenient name for someone whose user 
name is inconvenient. For example, if a user named “Bob Anderson” had the login name
“anderson,” you might want to use:


alias bob anderson 
so that you could send mail to the shorter name, “bob.” 
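

The effect of the two alias commands above can be pictured with a short Python sketch. It is only an illustration of the idea, not Mail's own code: the function name expand_recipients and the dictionary layout are hypothetical, the sketch expands one level of aliases only, and dropping duplicate names is an assumption made here for tidiness.

```python
def expand_recipients(recipients, aliases):
    """Replace each alias with the names it stands for (hypothetical helper)."""
    expanded = []
    for name in recipients:
        # A name that is not an alias simply stands for itself.
        for target in aliases.get(name, [name]):
            if target not in expanded:   # assumption: no duplicate copies
                expanded.append(target)
    return expanded


# The aliases as the two alias commands above would define them.
aliases = {
    "project": ["sam", "sally", "steve", "susan"],
    "bob": ["anderson"],
}

# Mail addressed to "project bob" goes to the five real user names.
print(expand_recipients(["project", "bob"], aliases))
```

A name with no alias, such as “frank,” passes through unchanged.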


While the alias and set commands allow you to customize Mail, they have the draw- 
back that they must be retyped each time you enter Mail. To make them more convenient to 
use, Mail always looks for two files when it is invoked. It first reads a system wide file 
“/usr/lib/Mail.rc,” then a user specific file, “.mailrc,” which is found in the user’s home direc- 
tory. The system wide file is maintained by the system administrator and contains set com- 
mands that are applicable to all users of the system. The “.mailrc” file is usually used by each 
user to set options the way he likes and define individual aliases. For example, my .mailrc file
looks like this:


set ask nosave SHELL=/bin/csh 


As you can see, it is possible to set many options in the same set command. The “nosave” 
option is described in section 5. 
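

The two forms of the set command can be told apart mechanically: a word containing “=” is a valued option, and any other word is a binary option. The Python sketch below illustrates that rule under stated assumptions; parse_set is a hypothetical name, and the sketch does not interpret option names in any way.

```python
def parse_set(arguments):
    """Split the arguments of a set command into binary and valued options."""
    binary, valued = [], {}
    for word in arguments.split():
        if "=" in word:
            # Valued option: no spaces are allowed around the "=".
            name, value = word.split("=", 1)
            valued[name] = value
        else:
            # Binary option: naming it is enough.
            binary.append(word)
    return binary, valued


# The single set command from the .mailrc example above.
binary, valued = parse_set("ask nosave SHELL=/bin/csh")
print(binary)   # the words without "="
print(valued)   # the name=value pairs
```

Splitting on white space first is what makes a space inside “SHELL=/bin/csh” an error: the two halves would be read as separate words.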


Mail aliasing is implemented at the system-wide level by the mail delivery system send- 
mail. These aliases are stored in the file /usr/lib/aliases and are accessible to all users of the 
system. The lines in /usr/lib/aliases are of the form: 


alias: name1, name2, name3


where alias is the mailing list name and the names are the members of the list. Long lists can
be continued onto the next line by starting the next line with a space or tab. Remember that 
you must execute the shell command newaliases after editing /usr/lib/aliases since the 
delivery system uses an indexed file created by newaliases. 


We have seen that Mail can be invoked with command line arguments which are people
to send the message to, or with no arguments to read mail. Specifying the -f flag on the
command line causes Mail to read messages from a file other than your system mailbox. For
example, if you have a collection of messages in the file “letters” you can use Mail to read
them with:


% Mail -f letters


You can use all the Mail commands described in this document to examine, modify, or delete 
messages from your “letters” file, which will be rewritten when you leave Mail with the quit 
command described below. 


Since mail that you read is saved in the file mbox in your home directory by default, you 
can read mbox in your home directory by using simply 

% Mail -f

Normally, messages that you examine using the type command are saved in the file
“mbox” in your home directory if you leave Mail with the quit command described below. If
you wish to retain a message in your system mailbox you can use the preserve command to
tell Mail to leave it there. The preserve command accepts a list of message numbers, just
like type and may be abbreviated to pre.


Messages in your system mailbox that you do not examine are normally retained in your
system mailbox automatically. If you wish to have such a message saved in mbox without
reading it, you may use the mbox command. For example,


mbox 2 


in our example would cause the second message (from sam) to be saved in mbox when the 
quit command is executed. Mbox is also the way to direct messages to your mbox file if you 
have set the “hold” option described above. Mbox can be abbreviated to mb. 


When you have perused all the messages of interest, you can leave Mail with the quit 
command, which saves the messages you have typed but not deleted in the file mbox in your 
login directory. Deleted messages are discarded irretrievably, and messages left untouched are 
preserved in your system mailbox so that you will see them the next time you type: 


% Mail 
The quit command can be abbreviated to simply q. 


If you wish for some reason to leave Mail quickly without altering either your system 
mailbox or mbox, you can type the x command (short for exit), which will immediately 
return you to the Shell without changing anything. 


If, instead, you want to execute a Shell command without leaving Mail, you can type the 
command preceded by an exclamation point, just as in the text editor. Thus, for instance: 


!date
will print the current date without leaving Mail. 


Finally, the help command is available to print out a brief summary of the Mail com- 
mands, using only the single character command abbreviations. 


3. Maintaining folders


Mail includes a simple facility for maintaining groups of messages together in folders. 
This section describes this facility. 


To use the folder facility, you must tell Mail where you wish to keep your folders. Each 
folder of messages will be a single file. For convenience, all of your folders are kept in a single 
directory of your choosing. To tell Mail where your folder directory is, put a line of the form 


set folder=letters 


in your .mailrc file. If, as in the example above, your folder directory does not begin with a
‘/,’ Mail will assume that your folder directory is to be found starting from your home
directory. Thus, if your home directory is /usr/person the above example tells Mail to find your
folder directory in /usr/person/letters.


Anywhere a file name is expected, you can use a folder name, preceded with ‘+.’ For 
example, to put a message into a folder with the save command, you can use: 


save +classwork 


to save the current message in the classwork folder. If the classwork folder does not yet exist, 
it will be created. Note that messages which are saved with the save command are automati- 
cally removed from your system mailbox. 
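

The way a ‘+’ name turns into a file name can be sketched in a few lines of Python. This is an illustration of the rules just described, not Mail's implementation; folder_path and its parameters are hypothetical names.

```python
import posixpath


def folder_path(name, folder_option, home):
    """Expand a '+folder' name to a full path; other names pass through."""
    if not name.startswith("+"):
        return name
    # A folder directory that does not begin with '/' is taken
    # relative to the home directory.
    if folder_option.startswith("/"):
        directory = folder_option
    else:
        directory = posixpath.join(home, folder_option)
    return posixpath.join(directory, name[1:])


# With "set folder=letters" and a home directory of /usr/person:
print(folder_path("+classwork", "letters", "/usr/person"))
```

Under these assumptions the classwork folder is the single file /usr/person/letters/classwork.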


In order to make a copy of a message in a folder without causing that message to be 
removed from your system mailbox, use the copy command, which is identical in all other 
respects to the save command. For example, 


copy +classwork 


copies the current message into the classwork folder and leaves a copy in your system mail- 
box. 


The folder command can be used to direct Mail to the contents of a different folder. 
For example, 


folder +classwork 


directs Mail to read the contents of the classwork folder. All of the commands that you can 
use on your system mailbox are also applicable to folders, including type, delete, and reply. 
To inquire which folder you are currently editing, use simply: 


folder 


To list your current set of folders, use the folders command. 


To start Mail reading one of your folders, you can use the -f option described in section
2. For example:


% Mail -f +classwork


will cause Mail to read your classwork folder without looking at your system mailbox. 


4. More about sending mail


4.1. Tilde escapes 


While typing in a message to be sent to others, it is often useful to be able to invoke the 
text editor on the partial message, print the message, execute a shell command, or do some 
other auxiliary function. Mail provides these capabilities through tilde escapes, which consist 
of a tilde (~) at the beginning of a line, followed by a single character which indicates the func- 
tion to be performed. For example, to print the text of the message so far, use: 


~p


which will print a line of dashes, the recipients of your message, and the text of the message 
so far. Since Mail requires two consecutive RUBOUT’s to abort a letter, you can use a single 
RUBOUT to abort the output of ~p or any other ~ escape without killing your letter. 


If you are dissatisfied with the message as it stands, you can invoke the text editor on it 
using the escape 


~e


which causes the message to be copied into a temporary file and an instance of the editor to 
be spawned. After modifying the message to your satisfaction, write it out and quit the editor. 
Mail will respond by typing 
(continue) 

after which you may continue typing text which will be appended to your message, or type 
<control-d> to end the message. A standard text editor is provided by Mail. You can over- 
ride this default by setting the valued option “EDITOR” to something else. For example, you 
might prefer: 


set EDITOR=/usr/ucb/ex 


Many systems offer a screen editor as an alternative to the standard text editor, such as 
the vi editor from UC Berkeley. To use the screen, or visual editor, on your current message, 
you can use the escape, 


~v


~v works like ~e, except that the screen editor is invoked instead. A default screen editor is
defined by Mail. If it does not suit you, you can set the valued option “VISUAL” to the path 
name of a different editor. 


It is often useful to be able to include the contents of some file in your message; the 
escape 


~r filename


is provided for this purpose, and causes the named file to be appended to your current mes- 
sage. Mail complains if the file doesn’t exist or can’t be read. If the read is successful, the 
number of lines and characters appended to your message is printed, after which you may 
continue appending text. The filename may contain shell metacharacters like * and ? which 
are expanded according to the conventions of your shell. 


As a special case of ~r, the escape


~d


reads in the file “dead.letter” in your home directory. This is often useful since Mail copies
the text of your message there when you abort a message with RUBOUT.


To save the current text of your message on a file you may use the


~w filename


escape. Mail will print out the number of lines and characters written to the file, after which
you may continue appending text to your message. Shell metacharacters may be used in the
filename, as in ~r and are expanded with the conventions of your shell.


If you are sending mail from within Mail’s command mode you can read a message sent 
to you into the message you are constructing with the escape: 


~m 4


which will read message 4 into the current message, shifted right by one tab stop. You can
name any non-deleted message, or list of messages. Messages can also be forwarded without
shifting by a tab stop with ~f. This is the usual way to forward a message.


If, in the process of composing a message, you decide to add additional people to the list 
of message recipients, you can do so with the escape 


~t name1 name2 ...


You may name as few or many additional recipients as you wish. Note that the users origi- 
nally on the recipient list will still receive the message; you cannot remove someone from the 
recipient list with ~t.


If you wish, you can associate a subject with your message by using the escape 
~s Arbitrary string of text


which replaces any previous subject with “Arbitrary string of text.” The subject, if given, is 
sent near the top of the message prefixed with “Subject:” You can see what the message will
look like by using ~p.


For political reasons, one occasionally prefers to list certain people as recipients of car- 
bon copies of a message rather than direct recipients. The escape 


~c name1 name2 ...


adds the named people to the “Cc:” list, similar to ~t. Again, you can execute ~p to see what
the message will look like. 


The recipients of the message together constitute the “To:” field, the subject the
“Subject:” field, and the carbon copies the “Cc:” field. If you wish to edit these in ways impossible
with the ~t, ~s, and ~c escapes, you can use the escape


~h
which prints “To:” followed by the current list of recipients and leaves the cursor (or print- 
head) at the end of the line. If you type in ordinary characters, they are appended to the end 
of the current list of recipients. You can also use your erase character to erase back into the
list of recipients, or your kill character to erase them altogether. Thus, for example, if your 
erase and kill characters are the standard # and @ symbols, 

~h
To: root kurt####bill


would change the initial recipients “root kurt” to “root bill.” When you type a newline, Mail
advances to the “Subject:” field, where the same rules apply. Another newline brings you to
the “Cc:” field, which may be edited in the same fashion. Another newline leaves you
appending text to the end of your message. You can use ~p to print the current text of the header
fields and the body of the message.


To effect a temporary escape to the shell, the escape 
~!command


is used, which executes command and returns you to mailing mode without altering the text of 
your message. If you wish, instead, to filter the body of your message through a shell com- 
mand, then you can use 


~|command


which pipes your message through the command and uses the output as the new text of your
message. If the command produces no output, Mail assumes that something is amiss and 
retains the old version of your message. A frequently-used filter is the command fmt, 
designed to format outgoing mail. 


To effect a temporary escape to Mail command mode instead, you can use the 
~:Mail command 


escape. This is especially useful for retyping the message you are replying to, using, for exam- 
ple: 


~:t
It is also useful for setting options and modifying aliases. 


If you wish (for some reason) to send a message that contains a line beginning with a 
tilde, you must double it. Thus, for example, 


~~This line begins with a tilde.


sends the line


~This line begins with a tilde.


Finally, the escape 
~?


prints out a brief summary of the available tilde escapes. 


On some terminals (particularly ones with no lower case) tildes are difficult to type.
Mail allows you to change the escape character with the “escape” option. For example, I set 


set escape=] 


and use a right bracket instead of a tilde. If I ever need to send a line beginning with right 
bracket, I double it, just as for ~. Changing the escape character removes the special meaning 
of ~. 
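

The rules for the escape character, including doubling and the “escape” option, can be summed up in a small Python sketch. It illustrates the behavior described in this section and is not taken from Mail itself; classify_line is a hypothetical name.

```python
def classify_line(line, escape="~"):
    """Decide whether one input line is message text or an escape.

    Returns ("text", line_to_send) or ("escape", rest_of_line).
    """
    if line.startswith(escape * 2):
        # A doubled escape character sends a line that starts with one.
        return ("text", line[len(escape):])
    if line.startswith(escape):
        return ("escape", line[len(escape):])
    return ("text", line)


print(classify_line("~~This line begins with a tilde."))
print(classify_line("~p"))
# After "set escape=]" the tilde is ordinary text again.
print(classify_line("~p", escape="]"))
print(classify_line("]p", escape="]"))
```

Checking for the doubled character before the single one is what lets a line of text begin with the escape character at all.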


4.2. Network access 


This section describes how to send mail to people on other machines. Recall that send- 
ing to a plain login name sends mail to that person on your machine. If your machine is 
directly (or sometimes, even, indirectly) connected to the Arpanet, you can send messages to 
people on the Arpanet using a name of the form 


name@host 


where name is the login name of the person you’re trying to reach and host is the name of the 
machine where he logs in on the Arpanet.


If your recipient logs in on a machine connected to yours by UUCP (the Bell Labora- 
tories supplied network that communicates over telephone lines), sending mail to him is a bit 
more complicated. You must know the list of machines through which your message must 
travel to arrive at his site. So, if his machine is directly connected to yours, you can send mail 
to him using the syntax: 


host!name 


where, again, host is the name of his machine and name is his login name. If your message 
must go through an intermediate machine first, you must use the syntax: 


intermediate!host!name 


and so on. It is actually a feature of UUCP that the map of all the systems in the network is 
not known anywhere (except where people decide to write it down for convenience). Talk to 
your system administrator about the machines connected to your site. 
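

A UUCP address is simply the list of machines, in the order the message must visit them, followed by the login name, all joined with ‘!’. The Python sketch below shows that construction; uucp_address is a hypothetical helper, and the machine names are the placeholders used above rather than real hosts.

```python
def uucp_address(hops, name):
    """Join the machines a message passes through, in order, with '!'."""
    return "!".join(list(hops) + [name])


# Directly connected machine.
print(uucp_address(["host"], "name"))
# One intermediate machine on the way.
print(uucp_address(["intermediate", "host"], "name"))
```

The sketch cannot supply the route itself: as noted above, you must already know which machines lie between your site and the recipient's.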


If you want to send a message to a recipient on the Berkeley network (Berknet), you use
the syntax: 


host:name 


where host is his machine name and name is his login name. Unlike UUCP, you need not 
know the names of the intermediate machines. 


When you use the reply command to respond to a letter, there is a problem of figuring 
out the names of the users in the “To:” and “Cc:” lists relative to the current machine. If the 
original letter was sent to you by someone on the local machine, then this problem does not 
exist, but if the message came from a remote machine, the problem must be dealt with. Mail 
uses a heuristic to build the correct name for each user relative to the local machine. So, 
when you reply to remote mail, the names in the “To:” and “Cc:” lists may change some- 
what. 


4.3. Special recipients 


As described previously, you can send mail to either user names or alias names. It is 
also possible to send messages directly to files or to programs, using special conventions. If a 
recipient name has a ‘/’ in it or begins with a ‘+’, it is assumed to be the path name of a file 
into which to send the message. If the file already exists, the message is appended to the end 
of the file. If you want to name a file in your current directory (i.e., one for which a ‘/’ would
not usually be needed) you can precede the name with ‘./’ So, to send mail to the file “memo”
in the current directory, you can give the command: 


% Mail ./memo 


If the name begins with a ‘+,’ it is expanded into the full path name of the folder name in 
your folder directory. This ability to send mail to files can be used for a variety of purposes, 
such as maintaining a journal and keeping a record of mail sent to a certain group of users. 
The second example can be done automatically by including the full pathname of the record 
file in the alias command for the group. Using our previous alias example, you might give 
the command: 


alias project sam sally steve susan /usr/project/mail_record


Then, all mail sent to “project” would be saved on the file “/usr/project/mail_record” as well
as being sent to the members of the project. This file can be examined using Mail -f.


It is sometimes useful to send mail directly to a program; for example, one might write a
project billboard program and want to access it using Mail. To send messages to the billboard
program, one can send mail to the special name ‘|billboard’ for example. Mail treats recipient
names that begin with a ‘|’ as a program to send the mail to. An alias can be set up to
reference a ‘|’ prefaced name if desired. Caveats: the shell treats ‘|’ specially, so it must be quoted
on the command line. Also, the ‘| program’ must be presented as a single argument to mail.
The safest course is to surround the entire name with double quotes. This also applies to
usage in the alias command. For example, if we wanted to alias ‘rmsgs’ to ‘rmsgs -s’ we
would need to say:


alias rmsgs "| rmsgs -s"
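

The conventions of this section amount to a small decision procedure on each recipient name, which the Python sketch below illustrates. It is not Mail's own logic: recipient_kind is a hypothetical name, and testing for ‘|’ before testing for ‘/’ is an assumption made here so that a program named by a full path is still treated as a program.

```python
def recipient_kind(name):
    """Classify one recipient name following the rules in this section."""
    if name.startswith("|"):
        return "program"        # the message is piped to the program
    if name.startswith("+"):
        return "folder"         # a file in your folder directory
    if "/" in name:
        return "file"           # a path name; the message is appended
    return "user or alias"


for name in ["root", "project", "./memo", "+classwork", "| rmsgs -s"]:
    print(name, "->", recipient_kind(name))
```

Note that a bare name such as “memo” is taken as a user or alias, which is why the ‘./’ prefix is needed to name a file in the current directory.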


5. Additional features


This section describes some additional commands of use for reading your mail, setting 
options, and handling lists of messages. 


5.1. Message lists 


Several Mail commands accept a list of messages as an argument. Along with type and 
delete, described in section 2, there is the from command, which prints the message headers 
associated with the message list passed to it. The from command is particularly useful in 
conjunction with some of the message list features described below. 


A message list consists of a list of message numbers, ranges, and names, separated by 
spaces or tabs. Message numbers may be either decimal numbers, which directly specify
messages, or one of the special characters “^” “.” or “$” to specify the first relevant, current, or
last relevant message, respectively. Relevant here means “not deleted” for most commands
and “deleted” for the undelete command.


A range of messages consists of two message numbers (of the form described in the pre- 
vious paragraph) separated by a dash. Thus, to print the first four messages, use 


type 1-4


and to print all the messages from the current message to the last message, use


type .-$


A name is a user name. The user names given in the message list are collected together
and each message selected by other means is checked to make sure it was sent by one of the
named users. If the message list consists entirely of user names, then every message sent by one
of those users that is relevant (in the sense described earlier) is selected. Thus, to print every
message sent to you by “root,” do


type root 


As a shorthand notation, you can specify simply “*” to get every relevant (same sense) 
message. Thus, 


type * 

prints all undeleted messages, 
delete * 

deletes all undeleted messages, and 
undelete * 

undeletes all deleted messages. 


You can search for the presence of a word in subject lines with /. For example, to print 
the headers of all messages that contain the word “PASCAL,” do: 


from /pascal 


Note that subject searching ignores upper/lower case differences. 
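

To make the message list rules concrete, here is a simplified Python sketch that expands a list into message numbers. It is an illustration only, not Mail's parser: select_messages and its parameters are hypothetical, user names and ‘/’ subject searches are left out, “relevant” is taken to mean “not deleted,” and silently skipping irrelevant messages inside a range is an assumption.

```python
def select_messages(spec, deleted, current):
    """Expand a message list into message numbers (simplified sketch).

    spec    -- the list as typed, for example "1-4 $"
    deleted -- list of booleans; deleted[i] is True if message i+1 is deleted
    current -- number of the current message
    Only numbers, ranges, "^", ".", "$" and "*" are handled here.
    """
    relevant = [n for n, gone in enumerate(deleted, start=1) if not gone]

    def number(token):
        if token == "^":
            return relevant[0]      # first relevant message
        if token == "$":
            return relevant[-1]     # last relevant message
        if token == ".":
            return current
        return int(token)

    chosen = []
    for item in spec.split():
        if item == "*":
            wanted = relevant
        elif "-" in item:
            first, last = item.split("-", 1)
            wanted = range(number(first), number(last) + 1)
        else:
            wanted = [number(item)]
        for n in wanted:
            if n in relevant and n not in chosen:
                chosen.append(n)
    return chosen


# Five messages, the third deleted, message 2 current.
deleted = [False, False, True, False, False]
print(select_messages("1-4", deleted, 2))
print(select_messages(".-$", deleted, 2))
print(select_messages("*", deleted, 2))
```

For the undelete command the same procedure would apply with the sense of “relevant” reversed.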


5.2. List of commands 
This section describes all the Mail commands available when receiving mail. 
! Used to preface a command to be executed by the shell. 


- The - command goes to the previous message and prints it. The - command may be
given a decimal number n as an argument, in which case the nth previous message is
gone to and printed.


Print


Like print, but also print out ignored header fields. See also print and ignore. 


Reply 


Note the capital R in the name. Frame a reply to one or more messages. The reply
(or replies if you are using this on multiple messages) will be sent ONLY to the person 
who sent you the message (respectively, the set of people who sent the messages you are 
replying to). You can add people using the ~t and ~c tilde escapes. The subject in your
reply is formed by prefacing the subject in the original message with “Re:” unless it 
already began thus. If the original message included a “reply-to” header field, the reply 
will go only to the recipient named by “reply-to.” You type in your message using the 
same conventions available to you through the mail command. The Reply command is 
especially useful for replying to messages that were sent to enormous distribution groups 
when you really just want to send a message to the originator. Use it often. 


Type 


Identical to the Print command. 


alias 


Define a name to stand for a set of other names. This is used when you want to send 
messages to a certain group of people and want to avoid retyping their names. For 
example 


alias project john sue willie kathryn 


creates an alias project which expands to the four people John, Sue, Willie, and 
Kathryn. 


alternates 


If you have accounts on several machines, you may find it convenient to use the 
/usr/lib/aliases on all the machines except one to direct your mail to a single account. 
The alternates command is used to inform Mail that each of these other addresses is 
really you. Alternates takes a list of user names and remembers that they are all actu- 
ally you. When you reply to messages that were sent to one of these alternate names, 
Mail will not bother to send a copy of the message to this other address (which would 
simply be directed back to you by the alias mechanism). If alternates is given no argu- 
ment, it lists the current set of alternate names. Alternates is usually used in the 
.mailrc file.


chdir 


The chdir command allows you to change your current directory. Chdir takes a single 
argument, which is taken to be the pathname of the directory to change to. If no argu- 
ment is given, chdir changes to your home directory. 


copy The copy command does the same thing that save does, except that it does not mark
the messages it is used on for deletion when you quit.


delete
Deletes a list of messages. Deleted messages can be reclaimed with the undelete command.


dt The dt command deletes the current message and prints the next message. It is useful
for quickly reading and disposing of mail.


edit


To edit individual messages using the text editor, the edit command is provided. The 
edit command takes a list of messages as described under the type command and 
processes each by writing it into the file Messagex where x is the message number being 
edited and executing the text editor on it. When you have edited the message to your 
satisfaction, write the message out and quit, upon which Mail will read the message back 
and remove the file. Edit may be abbreviated to e. 


else Marks the end of the then-part of an if statement and the beginning of the part to take
effect if the condition of the if statement is false. 


endif 
Marks the end of an if statement. 


exit Leave Mail without updating the system mailbox or the file you were reading. Thus, if
you accidentally delete several messages, you can use exit to avoid scrambling your
mailbox.


file The same as folder. 


folders 
List the names of the folders in your folder directory. 


folder 
The folder command switches to a new mail file or folder. With no arguments, it tells 
you which file you are currently reading. If you give it an argument, it will write out 
changes (such as deletions) you have made in the current file and read the new file. 
Some special conventions are recognized for the name: 


Name        Meaning
#           Previous file read
%           Your system mailbox
%name       Name’s system mailbox
&           Your ~/mbox file
+folder     A file in your folder directory


from 
The from command takes a list of messages and prints out the header lines for each 
one; hence 


from joe 
is the easy way to display all the message headers from “joe.” 


headers 

When you start up Mail to read your mail, it lists the message headers that you have. 
These headers tell you who each message is from, when it was sent, how many lines
and characters it contains, and the “Subject:” header field of each message, if
present. In addition, Mail tags the message header of each message that has been the 
object of the preserve command with a “P.” Messages that have been saved or writ- 
ten are flagged with a “*.” Finally, deleted messages are not printed at all. If you wish 
to reprint the current list of message headers, you can do so with the headers com- 
mand. The headers command (and thus the initial header listing) only lists the first so 
many message headers. The number of headers listed depends on the speed of your ter- 
minal. This can be overridden by specifying the number of headers you want with the 
window option. Mail maintains a notion of the current “window” into your messages for 
the purposes of printing headers. Use the z command to move forward and back a win- 
dow. You can move Mail’s notion of the current window directly to a particular message 
by using, for example, 


headers 40 


to move Mail’s attention to the messages around message 40. The headers command 
can be abbreviated to h. 


help Print a brief and usually out of date help message about the commands in Mail. Refer 
to this manual instead. 


hold Arrange to hold a list of messages in the system mailbox, instead of moving them to the
file mbox in your home directory. If you set the binary option hold, this will happen by 
default. 


if Commands in your “.mailrc” file can be executed conditionally depending on whether 
you are sending or receiving mail with the if command. For example, you can do: 




if receive 
commands... 
endif 


An else form is also available: 


if send 

commands... 
else 

commands... 
endif 


Note that the only allowed conditions are receive and send. 


ignore 
Add the list of header fields named to the ignore list. Header fields in the ignore list are 
not printed on your terminal when you print a message. This allows you to suppress 
printing of certain machine-generated header fields, such as Via, which are not usually of 
interest. The Type and Print commands can be used to print a message in its entirety, 
including ignored fields. If ignore is executed with no arguments, it lists the current set 
of ignored fields. 


list List the valid Mail commands. 


mail Send mail to one or more people. If you have the ask option set, Mail will prompt you 
for a subject to your message. Then you can type in your message, using tilde escapes as 
described in section 4 to edit, print, or modify your message. To signal your satisfaction 
with the message and send it, type control-d at the beginning of a line, or a . alone on a 
line if you set the option dot. To abort the message, type two interrupt characters 
(RUBOUT by default) in a row or use the ~q escape. 


mbox 
Indicate that a list of messages be sent to mbox in your home directory when you quit. 
This is the default action for messages if you do not have the hold option set. 


next The next command goes to the next message and types it. If given a message list, next 
goes to the first such message and types it. Thus, 


next root 


goes to the next message sent by “root” and types it. The next command can be 
abbreviated to simply a newline, which means that one can go to and type a message by 
simply giving its message number or one of the magic characters “^”, “.”, or “$”. Thus, 


. 


prints the current message and 


4 


prints message 4, as described previously. 


preserve 
Same as hold. Cause a list of messages to be held in your system mailbox when you 
quit. 

quit Leave Mail and update the file, folder, or system mailbox you were reading. Messages 
that you have examined are marked as “read” and messages that existed when you 
started are marked as “old.” If you were editing your system mailbox and if you have set 
the binary option hold, all messages which have not been deleted, saved, or mboxed will 
be retained in your system mailbox. If you were editing your system mailbox and you 
did not have hold set, all messages which have not been deleted, saved, or preserved will 
be moved to the file mbox in your home directory. 


reply 

Frame a reply to a single message. The reply will be sent to the person who sent you the 
message to which you are replying, plus all the people who received the original message, 
except you. You can add people using the ~t and ~c tilde escapes. The subject in your 
reply is formed by prefacing the subject in the original message with “Re:” unless it 
already began thus. If the original message included a “reply-to” header field, the reply 
will go only to the recipient named by “reply-to.” You type in your message using the 
same conventions available to you through the mail command. 
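
The subject rule for reply can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not Mail's own code: the function name reply_subject is hypothetical, and a plain case-sensitive prefix test is an assumption.

```python
def reply_subject(original: str) -> str:
    """Return the subject line for a reply to a message with this subject."""
    # Assumption: "already began thus" is taken as an exact, case-sensitive
    # match on the prefix "Re:"; Mail itself may be more lenient.
    if original.startswith("Re:"):
        return original
    return "Re: " + original


print(reply_subject("Tuition fees"))
print(reply_subject("Re: Tuition fees"))
```

Both calls print the same subject, since the second one already carries the prefix.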


save It is often useful to be able to save messages on related topics in a file. The save 
command gives you the ability to do this. The save command takes as argument a list of 
message numbers, followed by the name of the file on which to save the messages. The 
messages are appended to the named file, thus allowing one to keep several messages in the 
file, stored in the order they were put there. The save command can be abbreviated to 
s. An example of the save command relative to our running example is: 


s 1 2 tuitionmail 


Saved messages are not automatically saved in mbox at quit time, nor are they selected 
by the next command described above, unless explicitly specified. 


set Set an option or give an option a value. Used to customize Mail. Section 5.3 contains a 
list of the options. Options can be binary, in which case they are on or off, or valued. 
To set a binary option option on, do 


set option 
To give the valued option option the value value, do 
set option=value 
Several options can be specified in a single set command. 


shell 
The shell command allows you to escape to the shell. Shell invokes an interactive shell 
and allows you to type commands to it. When you leave the shell, you will return to 
Mail. The shell used is a default assumed by Mail; you can override this default by 
setting the valued option “SHELL,” e.g.: 


set SHELL=/bin/csh 


source 


The source command reads Mail commands from a file. It is useful when you are try- 
ing to fix your “.mailrc” file and you need to re-read it. 


top The top command takes a message list and prints the first five lines of each addressed 
message. It may be abbreviated to to. If you wish, you can change the number of lines 
that top prints out by setting the valued option “toplines.” On a CRT terminal, 


set toplines=10 
might be preferred. 


type Print a list of messages on your terminal. If you have set the option crt to a number and 
the total number of lines in the messages you are printing exceeds that specified by crt, 
the messages will be printed by a terminal paging program such as more. 


undelete 
The undelete command causes a message that had been deleted previously to regain its 
initial status. Only messages that have been deleted may be undeleted. This command 
may be abbreviated to u. 

unset 
Reverse the action of setting a binary or valued option. 

visual 
It is often useful to be able to invoke one of two editors, based on the type of terminal 
one is using. To invoke a display oriented editor, you can use the visual command. 
The operation of the visual command is otherwise identical to that of the edit com- 
mand. 


Both the edit and visual commands assume some default text editors. These default 
editors can be overridden by the valued options “EDITOR” and “VISUAL” for the stan- 
dard and screen editors. You might want to do: 


set EDITOR=/usr/ucb/ex VISUAL=/usr/ucb/vi 


write 
The save command always writes the entire message, including the headers, into the 
file. If you want to write just the message itself, you can use the write command. The 
write command has the same syntax as the save command, and can be abbreviated to 
simply w. Thus, we could write the second message by doing: 


w 2 file.c 


As suggested by this example, the write command is useful for such tasks as sending 
and receiving source program text over the message system. 


z Mail presents message headers in windowfuls as described under the headers 
command. You can move Mail’s attention forward to the next window by giving the 


z+ 


command. Analogously, you can move to the previous window with: 


z- 


5.3. Custom options 


Throughout this manual, we have seen examples of binary and valued options. This sec- 
tion describes each of the options in alphabetical order, including some that you have not seen 
yet. To avoid confusion, please note that the options are either all lower case letters or all 
upper case letters. When I start a sentence such as: “Ask” causes Mail to prompt you for a 
subject header, I am only capitalizing “ask” as a courtesy to English. 


EDITOR 
The valued option “EDITOR” defines the pathname of the text editor to be used in the 
edit command and ~e escape. If not defined, a standard editor is used. 


SHELL 
The valued option “SHELL” gives the path name of your shell. This shell is used for 
the ! command and ~! escape. In addition, this shell expands file names with shell 
metacharacters like * and ? in them. 


VISUAL 
The valued option “VISUAL” defines the pathname of your screen editor for use in the 
visual command and ~v escape. A standard screen editor is used if you do not define 
one. 


append 
The “append” option is binary and causes messages saved in mbox to be appended to 
the end rather than prepended. Normally, Mail will put messages in mbox in the same order that the 
system puts messages in your system mailbox. By setting “append,” you are requesting 
that mbox be appended to regardless. It is in any event quicker to append. 


ask “Ask” is a binary option which causes Mail to prompt you for the subject of each mes- 
sage you send. If you respond with simply a newline, no subject field will be sent. 


askcc 
“Askcc” is a binary option which causes you to be prompted for additional carbon copy 
recipients at the end of each message. Responding with a newline shows your satisfac- 
tion with the current list. 


autoprint 
“Autoprint” is a binary option which causes the delete command to behave like dp — 
thus, after deleting a message, the next one will be typed automatically. This is useful 
for quickly scanning and deleting messages in your mailbox. 


debug 
The binary option “debug” causes debugging information to be displayed. Use of this 
option is the same as using the -d command line flag. 


dot “Dot” is a binary option which, if set, causes Mail to interpret a period alone on a line as 
the terminator of a message you are sending. 


escape 
To allow you to change the escape character used when sending mail, you can set the 
valued option “escape.” Only the first character of the “escape” option is used, and it 
must be doubled if it is to appear as the first character of a line of your message. If you 
change your escape character, then ~ loses all its special meaning, and need no longer be 
doubled at the beginning of a line. 


folder 
The name of the directory to use for storing folders of messages. If this name begins 
with a ‘/’, Mail considers it to be an absolute pathname; otherwise, the folder directory is 
found relative to your home directory. 
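
The lookup rule for the folder directory can be sketched as follows. This is a minimal illustration of the rule as described, assuming UNIX-style paths; the function name folder_directory and the sample paths are hypothetical.

```python
import posixpath


def folder_directory(folder: str, home: str) -> str:
    """Resolve the value of the "folder" option to a directory path."""
    # A name beginning with '/' is taken as an absolute pathname.
    if folder.startswith("/"):
        return folder
    # Otherwise the directory is found relative to the home directory.
    return posixpath.join(home, folder)


print(folder_directory("folders", "/usr/jdoe"))
print(folder_directory("/usr/spool/folders", "/usr/jdoe"))
```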


hold The binary option “hold” causes messages that have been read but not manually dealt 
with to be held in the system mailbox. This prevents such messages from being automat- 
ically swept into your mbox. 


ignore 
The binary option “ignore” causes RUBOUT characters from your terminal to be ignored 
and echoed as @’s while you are sending mail. RUBOUT characters retain their original 
meaning in Mail command mode. Setting the “ignore” option is equivalent to supplying 
the -i flag on the command line as described in section 6. 


ignoreeof 
An option related to “dot” is “ignoreeof,” which makes Mail refuse to accept a control-d 
as the end of a message. “Ignoreeof” also applies to Mail command mode. 


keep 
The “keep” option causes Mail to truncate your system mailbox instead of deleting it 
when it is empty. This is useful if you elect to protect your mailbox, which you would 
do with the shell command: 


chmod 600 /usr/spool/mail/yourname 


where yourname is your login name. If you do not do this, anyone can probably read 
your mail, although people usually don’t. 


keepsave 
When you save a message, Mail usually discards it when you quit. To retain all saved 
messages, set the “keepsave” option. 


metoo 
When sending mail to an alias, Mail makes sure that if you are included in the alias, 
that mail will not be sent to you. This is useful if a single alias is being used by all 
members of the group. If, however, you wish to receive a copy of all the messages you 
send to the alias, you can set the binary option “metoo.” 


noheader 
The binary option “noheader” suppresses the printing of the version and headers when 
Mail is first invoked. Setting this option is the same as using -N on the command line. 


nosave 
Normally, when you abort a message with two RUBOUTs, Mail copies the partial letter to 
the file “dead.letter” in your home directory. Setting the binary option “nosave” 
prevents this. 


quiet 
The binary option “quiet” suppresses the printing of the version when Mail is first 
invoked, as well as the printing of, for example, “Message 4:” by the type command. 


record 
If you love to keep records, then the valued option “record” can be set to the name of a 
file to save your outgoing mail. Each new message you send is appended to the end of 
the file. 


screen 
When Mail initially prints the message headers, it determines the number to print by 
looking at the speed of your terminal. The faster your terminal, the more it prints. The 
valued option “screen” overrides this calculation and specifies how many message 
headers you want printed. This number is also used for scrolling with the z command. 


sendmail 
To use an alternate delivery system, set the “sendmail” option to the full pathname of the 
program to use. Note: this is not for everyone! Most people should use the default 
delivery system. 


toplines 
The valued option “toplines” defines the number of lines that the “top” command will 
print out instead of the default five lines. 


verbose 
The binary option “verbose” causes Mail to invoke sendmail with the -v flag, which 
causes it to go into verbose mode and announce expansion of aliases, etc. Setting the 
“verbose” option is equivalent to invoking Mail with the -v flag as described in section 
6. 


6. Command line options 

This section describes command line options for Mail and what they are used for. 

-N Suppress the initial printing of headers. 


-d Turn on debugging information. Not of general interest. 


-f file 
Show the messages in file instead of your system mailbox. If file is omitted, Mail reads 
mbox in your home directory. 


-i Ignore tty interrupt signals. Useful on noisy phone lines, which generate spurious 
RUBOUT or DELETE characters. It’s usually more effective to change your interrupt 
character to control-c, for which see the stty shell command. 


-n Inhibit reading of /usr/lib/Mail.rc. Not generally useful, since /usr/lib/Mail.rc is usually 
empty. 


-s string 
Used for sending mail. String is used as the subject of the message being composed. If 
string contains blanks, you must surround it with quote marks. 


-u name 
Read name’s mail instead of your own. Unwitting others often neglect to protect their 
mailboxes, but discretion is advised. Essentially, -u user is a shorthand way of doing 
-f /usr/spool/mail/user. 


-v Use the -v flag when invoking sendmail. This feature may also be enabled by setting 
the option “verbose”. 


The following command line flags are also recognized, but are intended for use by pro- 
grams invoking Mail and not for people. 


-T file 
Arrange to print on file the contents of the article-id fields of all messages that were 
either read or deleted. -T is for the readnews program and should NOT be used for 
reading your mail. 


-h number 
Pass on hop count information. Mail will take the number, increment it, and pass it 
with -h to the mail delivery system. -h only has effect when sending mail and is used 
for network mail forwarding. 


-r name 
Used for network mail forwarding: interpret name as the sender of the message. The 
name and -r are simply sent along to the mail delivery system. Also, Mail will wait for 
the message to be sent and return the exit status. Also restricts formatting of message. 


Note that -h and -r, which are for network mail forwarding, are not used in practice 
since mail forwarding is now handled separately. They may disappear soon. 


7. Format of messages 


This section describes the format of messages. Messages begin with a from line, which 
consists of the word “From” followed by a user name, followed by anything, followed by a 
date in the format returned by the ctime library routine described in section 3 of the Unix 
Programmer’s Manual. A possible ctime format date is: 


Tue Dec 1 10:58:23 1981 


The ctime date may be optionally followed by a single space and a time zone indication, which 
should be three capital letters, such as PDT. 
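
As an illustration of the from line layout just described, here is a rough Python sketch that picks such a line apart. It is not the parser Mail uses. The function name parse_from_line is hypothetical, and the sketch assumes the date is the last five blank-separated tokens (before an optional trailing zone) and that month and weekday names are in English.

```python
import re
from datetime import datetime


def parse_from_line(line):
    """Return (user, date, zone) for a from line, or None if it does not fit."""
    tokens = line.split()
    if not tokens or tokens[0] != "From":
        return None
    zone = None
    # Optional trailing time zone: three capital letters, such as PDT.
    if re.fullmatch(r"[A-Z]{3}", tokens[-1]):
        zone = tokens.pop()
    # Need at least "From", a user name, and the five ctime tokens.
    if len(tokens) < 7:
        return None
    try:
        when = datetime.strptime(" ".join(tokens[-5:]), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    # Anything between the user name and the date is ignored.
    return tokens[1], when, zone


print(parse_from_line("From root Tue Dec 1 10:58:23 1981 PDT"))
```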


Following the from line are zero or more header field lines. Each header field line is of 
the form: 


name: information 


Name can be anything, but only certain header fields are recognized as having any meaning. 
The recognized header fields are: article-id, bcc, cc, from, reply-to, sender, subject, and to. 
Other header fields are also significant to other systems; see, for example, the current Arpanet 
message standard for much more on this topic. A header field can be continued onto follow- 
ing lines by making the first character on the following line a space or tab character. 
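
The header field layout, including continuation lines, can be sketched in Python as below. This is an illustrative reading of the rules above, not Mail's implementation. The function name parse_header_fields is hypothetical, and joining a continuation to the previous line with a single space is an assumption.

```python
def parse_header_fields(lines):
    """Collect (name, information) pairs up to the first blank line."""
    fields = []
    for line in lines:
        if line == "":
            break  # a blank line ends the header fields
        if line[0] in " \t" and fields:
            # Continuation: a leading space or tab extends the previous field.
            name, info = fields[-1]
            fields[-1] = (name, info + " " + line.strip())
        else:
            name, _, info = line.partition(":")
            fields.append((name.strip(), info.strip()))
    return fields


message = ["To: alice,", "\tbob", "Subject: lunch", "", "body: not a header"]
print(parse_header_fields(message))
```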


If any headers are present, they must be followed by a blank line. The part that follows 
is called the body of the message, and must be ASCII text, not containing null characters. 
Each line in the message body must be terminated with an ASCII newline character and no 
line may be longer than 512 characters. If binary data must be passed through the mail sys- 
tem, it is suggested that this data be encoded in a system which encodes six bits into a print- 
able character. For example, one could use the upper and lower case letters, the digits, and 
the characters comma and period to make up the 64 characters. Then, one can send a 16-bit 
binary number as three characters. These characters should be packed into lines, preferably 
lines about 70 characters long as long lines are transmitted more efficiently. 
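
The six-bit encoding suggested above can be sketched in Python. The 64 characters are the ones named in the text, but their ordering within the alphabet is an assumption, as are the function names encode16 and decode16; any two parties exchanging data would have to agree on the same ordering.

```python
import string

# Upper and lower case letters, digits, comma and period: 64 characters.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + ",."


def encode16(n):
    """Encode a 16-bit number as three printable characters, six bits each."""
    if not 0 <= n < 1 << 16:
        raise ValueError("not a 16-bit number")
    return "".join(ALPHABET[(n >> shift) & 0x3F] for shift in (12, 6, 0))


def decode16(text):
    """Invert encode16."""
    if len(text) != 3:
        raise ValueError("expected three characters")
    n = 0
    for ch in text:
        n = (n << 6) | ALPHABET.index(ch)
    return n


print(encode16(1981), decode16(encode16(1981)))
```

Three characters carry 18 bits, so two of them go unused for each 16-bit number.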


The message delivery system always adds a blank line to the end of each message. This 
blank line must not be deleted. 


The UUCP message delivery system sometimes adds a blank line to the end of a message 
each time it is forwarded through a machine. 


It should be noted that some network transport protocols enforce limits to the lengths of 
messages. 


8. Glossary 


This section contains the definitions of a few phrases peculiar to Mail. 


alias An alternative name for a person or list of people. 


flag An option, given on the command line of Mail, prefaced with a -. For example, -f is a 
flag. 

header field 
At the beginning of a message, a line which contains information that is part of the 
structure of the message. Popular header fields include to, cc, and subject. 

mail 
A collection of messages. Often used in the phrase, “Have you read your mail?” 


mailbox 
The place where your mail is stored, typically in the directory /usr/spool/mail. 


message 
A single letter from someone, initially stored in your mailbox. 


message list 
A string used in Mail command mode to describe a sequence of messages. 


option 
A piece of special purpose information used to tailor Mail to your taste. Options are 
specified with the set command. 


9. Summary of commands, options, and escapes 

This section gives a quick summary of the Mail commands, binary and valued options, 
and tilde escapes. 

The following table describes the commands: 


Command     Description 
!           Single command escape to shell 
-           Back up to previous message 
Print       Type message with ignored fields 
Reply       Reply to author of message only 
Type        Type message with ignored fields 
alias       Define an alias as a set of user names 
alternates  List other names you are known by 
chdir       Change working directory, home by default 
copy        Copy a message to a file or folder 
delete      Delete a list of messages 
dt          Delete current message, type next message 
endif       End of conditional statement; see if 
edit        Edit a list of messages 
else        Start of else part of conditional; see if 
exit        Leave Mail without changing anything 
file        Interrogate/change current mail file 
folder      Same as file 
folders     List the folders in your folder directory 
from        List headers of a list of messages 
headers     List current window of messages 
help        Print brief summary of Mail commands 
hold        Same as preserve 
if          Conditional execution of Mail commands 
ignore      Set/examine list of ignored header fields 
list        List valid Mail commands 
local       List other names for the local host 
mail        Send mail to specified names 
mbox        Arrange to save a list of messages in mbox 
next        Go to next message and type it 
preserve    Arrange to leave list of messages in system mailbox 
quit        Leave Mail; update system mailbox, mbox as appropriate 
reply       Compose a reply to a message 
save        Append messages, headers included, on a file 
set         Set binary or valued options 
shell       Invoke an interactive shell 
top         Print first so many (5 by default) lines of list of messages 
type        Print messages 
undelete    Undelete list of messages 
unset       Undo the operation of a set 
visual      Invoke visual editor on a list of messages 
write       Append messages to a file, don’t include headers 
z           Scroll to next/previous screenful of headers 




The following table describes the options. Each option is shown as being either a 
binary or valued option. 


Option     Type    Description 
EDITOR     valued  Pathname of editor for ~e and edit 
SHELL      valued  Pathname of shell for shell, ~! and ! 
VISUAL     valued  Pathname of screen editor for ~v, visual 
append     binary  Always append messages to end of mbox 
ask        binary  Prompt user for Subject: field when sending 
askcc      binary  Prompt user for additional Cc’s at end of message 
autoprint  binary  Print next message after delete 
crt        valued  Minimum number of lines before using more 
debug      binary  Print out debugging information 
dot        binary  Accept . alone on line to terminate message input 
escape     valued  Escape character to be used instead of ~ 
folder     valued  Directory to store folders in 
hold       binary  Hold messages in system mailbox by default 
ignore     binary  Ignore RUBOUT while sending mail 
ignoreeof  binary  Don’t terminate letters/command input with ^D 
keep       binary  Don’t unlink system mailbox when empty 
keepsave   binary  Don’t delete saved messages by default 
metoo      binary  Include sending user in aliases 
noheader   binary  Suppress initial printing of version and headers 
nosave     binary  Don’t save partial letter in dead.letter 
quiet      binary  Suppress printing of Mail version and message numbers 
record     valued  File to save all outgoing mail in 
screen     valued  Size of window of message headers for z, etc. 
sendmail   valued  Choose alternate mail delivery system 
toplines   valued  Number of lines to print in top 
verbose    binary  Invoke sendmail with the -v flag 


The following table summarizes the tilde escapes available while sending mail. 





Escape  Arguments  Description 
~!      command    Execute shell command 
~c      name ...   Add names to Cc: field 
~d                 Read dead.letter into message 
~e                 Invoke text editor on partial message 
~f      messages   Read named messages 
~h                 Edit the header fields 
~m      messages   Read named messages, right shift by tab 
~p                 Print message entered so far 
~q                 Abort entry of letter; like RUBOUT 
~r      filename   Read file into message 
~s      string     Set Subject: field to string 
~t      name ...   Add names to To: field 
~v                 Invoke screen editor on message 
~w      filename   Write message on file 
~|      command    Pipe message through command 
~~      string     Quote a ~ in front of string 
The following table shows the command line flags that Mail accepts: 


Flag       Description 
-N         Suppress the initial printing of headers 
-T file    Article-id’s of read/deleted messages to file 
-d         Turn on debugging 
-f file    Show messages in file or ~/mbox 
-h number  Pass on hop count for mail forwarding 
-i         Ignore tty interrupt signals 
-n         Inhibit reading of /usr/lib/Mail.rc 
-r name    Pass on name for mail forwarding 
-s string  Use string as subject in outgoing mail 
-u name    Read name’s mail instead of your own 
-v         Invoke sendmail with the -v flag 


Notes: -T, -d, -h, and -r are not for human use. 


10. Conclusion 


Mail is an attempt to provide a simple user interface to a variety of underlying message 
systems. Thanks are due to the many users who contributed ideas and testing to Mail. 


BC - An Arbitrary Precision Desk-Calculator Language 


Lorinda Cherry 
Robert Morris 


Bell Laboratories 
Murray Hill, New Jersey 07974 


Introduction 


BC is a language and a compiler for doing arbitrary precision arithmetic on the UNIX† 
time-sharing system [1]. The compiler was written to make conveniently available a collection 
of routines (called DC [5]) which are capable of doing arithmetic on integers of arbitrary size. 
The compiler is by no means intended to provide a complete programming language. It is a 
minimal language facility. 

There is a scaling provision that permits the use of decimal point notation. Provision is 
made for input and output in bases other than decimal. Numbers can be converted from 
decimal to octal by simply setting the output base to equal 8. 


The actual limit on the number of digits that can be handled depends on the amount of 
storage available on the machine. Manipulation of numbers with many hundreds of digits is 
possible even on the smallest versions of UNIX. 


The syntax of BC has been deliberately selected to agree substantially with the C 
language [2]. Those who are familiar with C will find few surprises in this language. 


Simple Computations with Integers 


The simplest kind of statement is an arithmetic expression on a line by itself. For 
instance, if you type in the line: 


142857 + 285714 
the program responds immediately with the line 
428571 


The operators -, *, /, %, and ^ can also be used; they indicate subtraction, multiplication, 
division, remaindering, and exponentiation, respectively. Division of integers produces an 
integer result truncated toward zero. Division by zero produces an error comment. 
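
Truncation toward zero differs from the flooring that some languages apply to negative quotients. The following Python sketch mimics the integer division described above; the function name bc_int_divide is hypothetical, and the code is only an illustration of the stated rule, not BC's implementation.

```python
def bc_int_divide(a, b):
    """Integer quotient truncated toward zero, as described for BC."""
    if b == 0:
        raise ZeroDivisionError("divide by zero")
    q = abs(a) // abs(b)
    # The quotient is negative only when the operands differ in sign.
    return q if (a < 0) == (b < 0) else -q


print(bc_int_divide(7, 2), bc_int_divide(-7, 2))
```

Python's own // operator would give -4 for -7 // 2, because it rounds toward negative infinity rather than toward zero.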


Any term in an expression may be prefixed by a minus sign to indicate that it is to be 
negated (the ‘unary’ minus sign). The expression 


7+-3 
is interpreted to mean that —3 is to be added to 7. 


† UNIX is a trademark of Bell Laboratories. 


More complex expressions with several operators and with parentheses are interpreted 
just as in Fortran, with ^ having the greatest binding power, then * and % and /, and finally + 
and -. Contents of parentheses are evaluated before material outside the parentheses. 
Exponentiations are performed from right to left and the other operators from left to right. 
The two expressions 


a^b^c and a^(b^c) 


are equivalent, as are the two expressions 


a*b*c and (a*b)*c 


BC shares with Fortran and C the undesirable convention that 


a/b*c is equivalent to (a/b)*c 
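
Python happens to follow the same grouping conventions, so the equivalences above can be checked with a short sketch. Here ** stands in for BC's exponentiation operator and // for integer division; the operand values are arbitrary positive integers chosen for illustration.

```python
a, b, c = 2, 3, 2

# Exponentiation groups from right to left.
right_to_left = a ** b ** c
explicit_right = a ** (b ** c)
explicit_left = (a ** b) ** c

# Division and multiplication group from left to right.
x, y, z = 12, 4, 3
left_to_right = x // y * z
explicit_first = (x // y) * z
explicit_second = x // (y * z)

print(right_to_left, explicit_right, explicit_left)
print(left_to_right, explicit_first, explicit_second)
```

The third value on each line differs from the first two, which shows that the alternative grouping gives a different answer.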

Internal storage registers to hold numbers have single lower-case letter names. The 
value of an expression can be assigned to a register in the usual way. The statement 


x=x+3 


has the effect of increasing by three the value of the contents of the register named x. When, 
as in this case, the outermost operator is an =, the assignment is performed but the result is 
not printed. Only 26 of these named storage registers are available. 


There is a built-in square root function whose result is truncated to an integer (but see 
scaling below). The lines 


x = sqrt(191) 
x 


produce the printed result 
13 


Bases 


There are special internal quantities, called ‘ibase’ and ‘obase’. The contents of ‘ibase’, 
initially set to 10, determines the base used for interpreting numbers read in. For example, 
the lines 


ibase = 8 
11 


will produce the output line 
9 


and you are all set up to do octal to decimal conversions. Beware, however, of trying to change 
the input base back to decimal by typing 


ibase = 10 


Because the number 10 is interpreted as octal, this statement will have no effect. For those 
who deal in hexadecimal notation, the characters A-F are permitted in numbers (no matter 
what base is in effect) and are interpreted as digits having values 10-15 respectively. The 
statement 


ibase = A 


will change you back to decimal input base no matter what the current input base is. 
Negative and large positive input bases are permitted but useless. No mechanism has been 
provided for the input of arbitrary numbers in bases less than 1 and greater than 16. 
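
The effect of the input base can be imitated with a small Python sketch. The function name read_number is hypothetical, and the sketch covers only the cases discussed above; how BC treats a multi-digit number containing a digit as large as the base is not addressed here.

```python
DIGITS = "0123456789ABCDEF"


def read_number(text, ibase):
    """Interpret a string of digits in the given input base."""
    value = 0
    for ch in text:
        # A-F always stand for 10-15, no matter what base is in effect.
        value = value * ibase + DIGITS.index(ch)
    return value


print(read_number("11", 8))   # the octal example above
print(read_number("10", 8))   # why "ibase = 10" changes nothing in octal
print(read_number("A", 8))    # why "ibase = A" restores decimal
```

With the input base at 8, the digits 10 denote eight, so assigning them to ibase leaves it at 8, whereas the single digit A always denotes ten.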

The contents of ‘obase’, initially set to 10, are used as the base for output numbers. The 
lines 


obase = 16 
1000 


will produce the output line 


3E8 


which is to be interpreted as a 3-digit hexadecimal number. Very large output bases are per- 
mitted, and they are sometimes useful. For example, large numbers can be output in groups 
of five digits by setting ‘obase’ to 100000. Strange (i.e. 1, 0, or negative) output bases are han- 
dled appropriately. 
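
Output conversion can be sketched in Python as repeated division by the output base. The names output_digits and show are hypothetical, the exact spacing BC uses for large bases is an assumption, and the strange bases (1, 0, or negative) are deliberately not modeled.

```python
def output_digits(n, obase):
    """Return the digits of a non-negative integer n in base obase."""
    if n < 0 or obase < 2:
        raise ValueError("this sketch handles only n >= 0 and obase >= 2")
    digits = []
    while True:
        n, d = divmod(n, obase)
        digits.append(d)
        if n == 0:
            break
    return digits[::-1]


def show(n, obase):
    """Format the digits: single characters up to base 16, groups beyond."""
    digits = output_digits(n, obase)
    if obase <= 16:
        return "".join("0123456789ABCDEF"[d] for d in digits)
    width = len(str(obase - 1))
    # Assumption: groups are zero-padded and separated by one space.
    return " ".join(str(d).zfill(width) for d in digits)


print(show(1000, 16))
print(show(1234567890, 100000))
```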


Very large numbers are split across lines with 70 characters per line. Lines which are 
continued end with \. Decimal output conversion is practically instantaneous, but output of 
very large numbers (i.e., more than 100 digits) with other bases is rather slow. Non-decimal 
output conversion of a one hundred digit number takes about three seconds. 


It is best to remember that ‘ibase’ and ‘obase’ have no effect whatever on the course of 
internal computation or on the evaluation of expressions, but only affect input and output 
conversion, respectively. 


Scaling 


A third special internal quantity called ‘scale’ is used to determine the scale of calculated 
quantities. Numbers may have up to 99 decimal digits after the decimal point. This frac- 
tional part is retained in further computations. We refer to the number of digits after the 
decimal point of a number as its scale. 


When two scaled numbers are combined by means of one of the arithmetic operations, 
the result has a scale determined by the following rules. For addition and subtraction, the 
scale of the result is the larger of the scales of the two operands. In this case, there is never 
any truncation of the result. For multiplications, the scale of the result is never less than the 
maximum of the two scales of the operands, never more than the sum of the scales of the 
operands and, subject to those two restrictions, the scale of the result is set equal to the con- 
tents of the internal quantity ‘scale’. The scale of a quotient is the contents of the internal 
quantity ‘scale’. The scale of a remainder is the sum of the scales of the quotient and the 
divisor. The result of an exponentiation is scaled as if the implied multiplications were per- 
formed. An exponent must be an integer. The scale of a square root is set to the maximum 
of the scale of the argument and the contents of ‘scale’. 
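
The scale rules in this paragraph can be written down compactly. The Python sketch below restates them under the reading that the multiplication rule clamps ‘scale’ between the larger operand scale and the sum of the operand scales; the function names are hypothetical.

```python
def add_scale(a, b):
    """Addition and subtraction: the larger of the two operand scales."""
    return max(a, b)


def mul_scale(a, b, scale):
    """Multiplication: 'scale', but never below max(a, b) nor above a + b."""
    return min(a + b, max(scale, a, b))


def div_scale(scale):
    """Quotient: the contents of 'scale'."""
    return scale


def rem_scale(scale, divisor_scale):
    """Remainder: scale of the quotient plus scale of the divisor."""
    return div_scale(scale) + divisor_scale


def sqrt_scale(a, scale):
    """Square root: the larger of the argument's scale and 'scale'."""
    return max(a, scale)


# 7 * 3.14 with scale = 0 keeps two digits after the decimal point.
print(mul_scale(0, 2, 0))
```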


All of the internal operations are actually carried out in terms of integers, with digits 
being discarded when necessary. In every case where digits are discarded, truncation and not 
rounding is performed. 
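
The difference between truncation and rounding shows up in the last retained digit. The following Python sketch uses the standard decimal module to imitate the truncation; the helper name truncate is hypothetical, and the default decimal precision limits the sketch to modest scales.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def truncate(value, scale):
    """Discard digits beyond the given scale without rounding."""
    return value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_DOWN)


two_thirds = Decimal(2) / Decimal(3)
truncated = truncate(two_thirds, 5)
rounded = two_thirds.quantize(Decimal("0.00001"), rounding=ROUND_HALF_UP)

print(truncated)  # what truncation gives
print(rounded)    # what rounding would have given instead
```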


The contents of ‘scale’ must be no greater than 99 and no less than 0. It is initially set 
to 0. In case you need more than 99 fraction digits, you may arrange your own scaling. 


The internal quantities ‘scale’, ‘ibase’, and ‘obase’ can be used in expressions just like 
other variables. The line 
scale = scale + 1 
increases the value of ‘scale’ by one, and the line 


scale 


causes the current value of ‘scale’ to be printed. 


The value of ‘scale’ retains its meaning as a number of decimal digits to be retained in 
internal computation even when ‘ibase’ or ‘obase’ are not equal to 10. The internal computa- 
tions (which are still conducted in decimal, regardless of the bases) are performed to the 
specified number of decimal digits, never hexadecimal or octal or any other kind of digits. 


Functions 


The name of a function is a single lower-case letter. Function names are permitted to 
collide with simple variable names. Twenty-six different defined functions are permitted in 
addition to the twenty-six variable names. The line 
define a(x) { 


begins the definition of a function with one argument. This line must be followed by one or 
more statements, which make up the body of the function, ending with a right brace }. 
Return of control from a function occurs when a return statement is executed or when the end 
of the function is reached. The return statement can take either of the two forms 


return 
return(x) 


In the first case, the value of the function is 0, and in the second, the value of the expression 
in parentheses. 


Variables used in the function can be declared as automatic by a statement of the form 
auto x,y,z 


There can be only one ‘auto’ statement in a function and it must be the first statement in the 
definition. These automatic variables are allocated space and initialized to zero on entry to 
the function and thrown away on return. The values of any variables with the same names 
outside the function are not disturbed. Functions may be called recursively and the automatic 
variables at each level of call are protected. The parameters named in a function definition 
are treated in the same way as the automatic variables of that function with the single excep- 
tion that they are given a value on entry to the function. An example of a function definition 
is 
define a(x,y) {
    auto z
    z = x*y
    return(z)
}


The value of this function, when called, will be the product of its two arguments. 


A function is called by the appearance of its name followed by a string of arguments 
enclosed in parentheses and separated by commas. The result is unpredictable if the wrong 
number of arguments is used. 


Functions with no arguments are defined and called using parentheses with nothing 
between them: b(). 


If the function a above has been defined, then the line 
a(7,3.14) 
would cause the result 21.98 to be printed and the line 
x = a(a(3,4),5) 


would cause the value of x to become 60. 
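

For readers more used to other languages, the same two calls can be checked in Python. This is only a sketch: Python's Decimal type is assumed here as a stand-in for BC numbers, and the local variable z plays the part of the automatic variable.

```python
from decimal import Decimal

def a(x, y):
    z = x * y          # z is local to this call, like an 'auto' variable
    return z

print(a(Decimal("7"), Decimal("3.14")))               # 21.98
x = a(a(Decimal("3"), Decimal("4")), Decimal("5"))    # nested call: a(12, 5)
print(x)                                              # 60
```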


Subscripted Variables 


A single lower-case letter variable name followed by an expression in brackets is called a 
subscripted variable (an array element). The variable name is called the array name and the 
expression in brackets is called the subscript. Only one-dimensional arrays are permitted. 
The names of arrays are permitted to collide with the names of simple variables and function 
names. Any fractional part of a subscript is discarded before use. Subscripts must be greater 
than or equal to zero and less than or equal to 2047. 


Subscripted variables may be freely used in expressions, in function calls, and in return 
statements. 


An array name may be used as an argument to a function, or may be declared as 
automatic in a function definition by the use of empty brackets: 
f(a[ ])
define f(a[ ])
auto a[ ]


When an array name is so used, the whole contents of the array are copied for the use of the 
function, and thrown away on exit from the function. Array names which refer to whole 
arrays cannot be used in any other contexts. 


Control Statements 


The ‘if’, the ‘while’, and the ‘for’ statements may be used to alter the flow within pro- 
grams or to cause iteration. The range of each of them is a statement or a compound state- 
ment consisting of a collection of statements enclosed in braces. They are written in the fol- 
lowing way 


if(relation) statement 
while(relation) statement 
for(expression1; relation; expression2) statement 


or 

if(relation) {statements} 

while(relation) {statements} 

for(expression1; relation; expression2) {statements} 

A relation in one of the control statements is an expression of the form 

x>y 
where two expressions are related by one of the six relational operators <, >, <=, >=, ==, or
!=. The relation == stands for ‘equal to’ and != stands for ‘not equal to’. The meaning of
the remaining relational operators is clear.


BEWARE of using = instead of == in a relational. Unfortunately, both of them are 
legal, so you will not get a diagnostic message, but = really will not do a comparison. 


The ‘if’ statement causes execution of its range if and only if the relation is true. Then 
control passes to the next statement in sequence. 


The ‘while’ statement causes execution of its range repeatedly as long as the relation is 
true. The relation is tested before each execution of its range and if the relation is false, con- 
trol passes to the next statement beyond the range of the while. 


The ‘for’ statement begins by executing ‘expression1’. Then the relation is tested and, if
true, the statements in the range of the ‘for’ are executed. Then ‘expression2’ is executed. 
The relation is tested, and so on. The typical use of the ‘for’ statement is for a controlled 
iteration, as in the statement 


for(i=1; i<=10; i=i+1) i 
which will print the integers from 1 to 10. Here are some examples of the use of the control 
statements. 


define f(n) {
    auto i, x
    x=1
    for(i=1; i<=n; i=i+1) x=x*i
    return(x)
}

The line 
f(a)


will print a factorial if a is a positive integer. Here is the definition of a function which will 
compute values of the binomial coefficient (m and n are assumed to be positive integers). 


define b(n,m) {
    auto x, j
    x=1
    for(j=1; j<=m; j=j+1) x=x*(n-j+1)/j
    return(x)
}
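

The two loops above can be checked against a model in another language. The Python sketch below mirrors the BC functions f and b line for line; it assumes integer arguments, and it uses Python's integer division (//) where the BC version relies on a scale of zero.

```python
def f(n):
    """Factorial by repeated multiplication, mirroring the BC function f."""
    x = 1
    for i in range(1, n + 1):
        x = x * i
    return x

def b(n, m):
    """Binomial coefficient, mirroring the BC function b."""
    x = 1
    for j in range(1, m + 1):
        # x holds the coefficient for j-1 at this point, so x*(n-j+1) is
        # divisible by j and the integer division loses nothing.
        x = x * (n - j + 1) // j
    return x

print(f(10))      # 3628800
print(b(10, 3))   # 120
```

The multiplication is done before the division at each step, which is why no fractional digits are ever needed.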


The following function computes values of the exponential function by summing the appropri- 
ate series without regard for possible truncation errors: 


scale = 20
define e(x){
    auto a, b, c, d, n
    a = 1
    b = 1
    c = 1
    d = 0
    n = 1
    while(1==1) {
        a = a*x
        b = b*n
        c = c + a/b
        n = n + 1
        if(c==d) return(c)
        d = c
    }
}

Some Details 


There are some language features that every user should know about even if he will not 
use them. 


Normally statements are typed one to a line. It is also permissible to type several state- 
ments on a line separated by semicolons. 


If an assignment statement is parenthesized, it then has a value and it can be used any- 
where that an expression can. For example, the line 


(x=y+17) 
not only makes the indicated assignment, but also prints the resulting value. 


Here is an example of a use of the value of an assignment statement even when it is not 
parenthesized. 


x = a[i=i+1]
causes a value to be assigned to x and also increments i before it is used as a subscript. 


The following constructs work in BC in exactly the same manner as they do in the C 
language. Consult the appendix or the C manuals [2] for their exact workings. 


x=y=z     is the same as     x=(y=z)
x =+ y                       x = x+y
x =- y                       x = x-y
x =* y                       x = x*y
x =/ y                       x = x/y
x =% y                       x = x%y
x =^ y                       x = x^y
x++                          (x=x+1)-1
x--                          (x=x-1)+1
++x                          x = x+1
--x                          x = x-1


Even if you don’t intend to use the constructs, if you type one inadvertently, something 
correct but unexpected may happen. 


WARNING! In some of these constructions, spaces are significant. There is a real 
difference between x=-y and x= -y. The first replaces x by x-y and the second by -y.


Three Important Things 

1. To exit a BC program, type ‘quit’. 

2. There is a comment convention identical to that of C and of PL/I. Comments begin 
with ‘/*’ and end with ‘*/’. 

3. There is a library of math functions which may be obtained by typing at command
level


bc -l


This command will load a set of library functions which, at the time of writing, consists of 
sine (named ‘s’), cosine (‘c’), arctangent (‘a’), natural logarithm (‘l’), exponential (‘e’) and
Bessel functions of integer order (‘j(n,x)’). Doubtless more functions will be added in time. 
The library sets the scale to 20. You can reset it to something else if you like. The design of 
these mathematical library routines is discussed elsewhere [3]. 


If you type 
bc file ...


BC will read and execute the named file or files before accepting commands from the key- 
board. In this way, you may load your favorite programs and function definitions. 


Appendix


1. Notation 


In the following pages syntactic categories are in italics; literals are in bold; material in 
brackets [] is optional. 


2. Tokens 


Tokens consist of keywords, identifiers, constants, operators, and separators. Token 
separators may be blanks, tabs or comments. Newline characters or semicolons separate state- 
ments. 


2.1. Comments 
Comments are introduced by the characters /* and terminated by */. 


2.2. Identifiers 


There are three kinds of identifiers — ordinary identifiers, array identifiers and function 
identifiers. All three types consist of single lower-case letters. Array identifiers are followed 
by square brackets, possibly enclosing an expression describing a subscript. Arrays are singly 
dimensioned and may contain up to 2048 elements. Indexing begins at zero so an array may 
be indexed from 0 to 2047. Subscripts are truncated to integers. Function identifiers are fol- 
lowed by parentheses, possibly enclosing arguments. The three types of identifiers do not 
conflict; a program can have a variable named x, an array named x and a function named x, 
all of which are separate and distinct. 


2.3. Keywords 


The following are reserved keywords: 
ibase if 
obase break 
scale define 
sqrt auto 
length return 
while quit 
for 


2.4. Constants 


Constants consist of arbitrarily long numbers with an optional decimal point. The
hexadecimal digits A-F are also recognized as digits with values 10-15, respectively.


3. Expressions 


The value of an expression is printed unless the main operator is an assignment. Pre- 
cedence is the same as the order of presentation here, with highest appearing first. Left or 
right associativity, where applicable, is discussed with each operator. 


3.1. Primitive expressions


3.1.1. Named expressions 


Named expressions are places where values are stored. Simply stated, named expres- 
sions are legal on the left side of an assignment. The value of a named expression is the value 
stored in the place named. 


3.1.1.1. identifiers 
Simple identifiers are named expressions. They have an initial value of zero. 


3.1.1.2. array-name[ expression ] 
Array elements are named expressions. They have an initial value of zero. 


3.1.1.3. scale, ibase and obase 


The internal registers scale, ibase and obase are all named expressions. scale is the 
number of digits after the decimal point to be retained in arithmetic operations. scale has an 
initial value of zero. ibase and obase are the input and output number radix respectively. 
Both ibase and obase have initial values of 10. 


3.1.2. Function calls 


3.1.2.1. function-name ([expression[,expression...]]) 


A function call consists of a function name followed by parentheses containing a 
comma-separated list of expressions, which are the function arguments. A whole array passed 
as an argument is specified by the array name followed by empty square brackets. All func- 
tion arguments are passed by value. As a result, changes made to the formal parameters have 
no effect on the actual arguments. If the function terminates by executing a return statement, 
the value of the function is the value of the expression in the parentheses of the return state- 
ment or is zero if no expression is provided or if there is no return statement. 


3.1.2.2. sqrt (expression) 


The result is the square root of the expression. The result is truncated in the least 
significant decimal place. The scale of the result is the scale of the expression or the value of 
scale, whichever is larger. 


3.1.2.3. length (expression ) 


The result is the total number of significant decimal digits in the expression. The scale 
of the result is zero. 


3.1.2.4. scale (expression) 


The result is the scale of the expression. The scale of the result is zero. 


3.1.3. Constants 


Constants are primitive expressions. 


3.1.4. Parentheses 


An expression surrounded by parentheses is a primitive expression. The parentheses are
used to alter the normal precedence. 


3.2. Unary operators
The unary operators bind right to left. 


3.2.1. - expression
The result is the negative of the expression. 


3.2.2. ++named-expression 


The named expression is incremented by one. The result is the value of the named 
expression after incrementing. 


3.2.3. --named-expression


The named expression is decremented by one. The result is the value of the named 
expression after decrementing. 


3.2.4. named-expression ++ 


The named expression is incremented by one. The result is the value of the named 
expression before incrementing. 


3.2.5. named-expression --


The named expression is decremented by one. The result is the value of the named 
expression before decrementing. 


3.3. Exponentiation operator 
The exponentiation operator binds right to left. 


3.3.1. expression ^ expression


The result is the first expression raised to the power of the second expression. The
second expression must be an integer. If a is the scale of the left expression and b is the
absolute value of the right expression, then the scale of the result is:


min(a×b, max(scale, a))


3.4. Multiplicative operators 
The operators *, /, % bind left to right. 


3.4.1. expression * expression 


The result is the product of the two expressions. If a and b are the scales of the two 
expressions, then the scale of the result is: 


min(a+b, max(scale, a, b))
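

A worked instance may make the rule concrete. The Python sketch below is a model, not BC itself; the function names product_scale and multiply are hypothetical, and Decimal values are assumed to stand in for BC numbers, with the exact product being truncated to the scale given by the formula.

```python
from decimal import Decimal, ROUND_DOWN

def product_scale(a, b, scale):
    """Scale of a product whose operands have scales a and b."""
    return min(a + b, max(scale, a, b))

def multiply(x, y, scale):
    """Exact product of two Decimals, truncated to the scale the rule gives."""
    a = max(0, -x.as_tuple().exponent)      # fraction digits in x
    b = max(0, -y.as_tuple().exponent)      # fraction digits in y
    keep = product_scale(a, b, scale)
    return (x * y).quantize(Decimal(1).scaleb(-keep), rounding=ROUND_DOWN)

print(multiply(Decimal("1.23"), Decimal("4.567"), 0))    # 5.617   (exact product is 5.61741)
print(multiply(Decimal("1.23"), Decimal("4.567"), 10))   # 5.61741 (a+b = 5 is the upper limit)
```

With scale at zero the product keeps only as many digits as the longer operand; raising scale recovers the full product but never adds digits beyond a+b.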


3.4.2. expression / expression 


The result is the quotient of the two expressions. The scale of the result is the value of 
scale. 


3.4.3. expression % expression 


The % operator produces the remainder of the division of the two expressions. More
precisely, a%b is a-a/b*b.


The scale of the result is the sum of the scale of the divisor and the value of scale.
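

Because the quotient a/b is truncated to scale digits before it is multiplied back, the remainder depends on the current value of scale. The Python sketch below illustrates this under stated assumptions: it is a model rather than BC, and the helper names trunc and remainder are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, localcontext

def trunc(v, digits):
    """Keep `digits` fraction digits of v, discarding the rest."""
    return v.quantize(Decimal(1).scaleb(-digits), rounding=ROUND_DOWN)

def remainder(a, b, scale):
    """Model of a % b as a - (a/b)*b, with the quotient cut off at `scale` digits."""
    a, b = Decimal(a), Decimal(b)
    with localcontext() as ctx:
        ctx.prec = 100                  # ample working digits for this sketch
        q = trunc(a / b, scale)         # the truncated quotient
        return a - q * b                # what is left of the dividend

print(remainder(7, 3, 0))   # 1     (7 - 2*3)
print(remainder(7, 3, 2))   # 0.01  (7 - 2.33*3)
```

To obtain the familiar integer remainder, scale should therefore be zero when % is used.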


3.5. Additive operators
The additive operators bind left to right. 


3.5.1. expression + expression 


The result is the sum of the two expressions. The scale of the result is the maximum of
the scales of the expressions. 


3.5.2. expression - expression


The result is the difference of the two expressions. The scale of the result is the max- 
imum of the scales of the expressions. 


3.6. Assignment operators
The assignment operators bind right to left. 


3.6.1. named-expression = expression 


This expression results in assigning the value of the expression on the right to the named 
expression on the left. 


3.6.2. named-expression =+ expression
3.6.3. named-expression =- expression
3.6.4. named-expression =* expression
3.6.5. named-expression =/ expression
3.6.6. named-expression =% expression
3.6.7. named-expression =^ expression


The result of the above expressions is equivalent to “named expression = named expres- 
sion OP expression”, where OP is the operator after the = sign. 


4. Relations 


Unlike all other operators, the relational operators are only valid as the object of an if, 
while, or inside a for statement. 


4.1. expression < expression 
4.2. expression > expression
4.3. expression <= expression 
4.4. expression >= expression 
4.5. expression == expression 
4.6. expression != expression 


5. Storage classes 


There are only two storage classes in BC, global and automatic (local). Only identifiers 
that are to be local to a function need be declared with the auto command. The arguments 
to a function are local to the function. All other identifiers are assumed to be global and 
available to all functions. All identifiers, global and local, have initial values of zero.
Identifiers declared as auto are allocated on entry to the function and released on returning
from the function. They therefore do not retain values between function calls. auto arrays
are specified by the array name followed by empty square brackets.


Automatic variables in BC do not work in exactly the same way as in either C or PL/I. 
On entry to a function, the old values of the names that appear as parameters and as 
automatic variables are pushed onto a stack. Until return is made from the function, reference 
to these names refers only to the new values. 
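

One reading of this paragraph is that each name has a single current value, with older values saved on a stack, so that a function called from inside another sees the caller's automatic values. The Python sketch below models that reading. It is an assumption made for illustration, not a description of the BC source; the names values, saved, enter and leave are hypothetical.

```python
values = {"x": 5, "y": 7}       # one current value per name
saved = []                      # old values pushed on function entry

def enter(names):
    """On entry: push the old values of the names, then start them at zero."""
    saved.append({name: values.get(name, 0) for name in names})
    for name in names:
        values[name] = 0

def leave():
    """On return: pop the old values back into place."""
    values.update(saved.pop())

def inner():
    # inner declares nothing, so 'x' means whatever value is current right now
    return values["x"]

def outer():
    enter(["x"])                # models: auto x
    values["x"] = 99
    seen = inner()              # inner sees 99, not the outer 5
    leave()
    return seen

print(outer())       # 99
print(values["x"])   # 5, restored on return
```

In C, by contrast, inner would have no access to the local x of outer at all.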


6. Statements 


Statements must be separated by semicolon or newline. Except where altered by control 
statements, execution is sequential. 


6.1. Expression statements 


When a statement is an expression, unless the main operator is an assignment, the value 
of the expression is printed, followed by a newline character. 


6.2. Compound statements 


Statements may be grouped together and used when one statement is expected by sur- 
rounding them with { }. 


6.3. Quoted string statements
"any string"


This statement prints the string inside the quotes. 


6.4. If statements 


if (relation) statement 


The substatement is executed if the relation is true. 


6.5. While statements 


while (relation) statement


The statement is executed while the relation is true. The test occurs before each execu- 
tion of the statement. 


6.6. For statements 


for (expression; relation; expression) statement 


The for statement is the same as
first-expression
while (relation) {
    statement
    last-expression
}


All three expressions must be present. 


6.7. Break statements 


break 
break causes termination of a for or while statement. 


6.8. Auto statements


auto identifier[,identifier]


The auto statement causes the values of the identifiers to be pushed down. The 
identifiers can be ordinary identifiers or array identifiers. Array identifiers are specified by 
following the array name by empty square brackets. The auto statement must be the first 
statement in a function definition. 


6.9. Define statements 


define function-name([parameter[,parameter...]]) {
statements }


The define statement defines a function. The parameters may be ordinary identifiers or 
array names. Array names must be followed by empty square brackets. 


6.10. Return statements 
return 


return( expression ) 


The return statement terminates a function, pops its auto variables, and
specifies the result of the function. The first form is equivalent to return(0). In the second
form, the result of the function is the value of the expression in parentheses.


6.11. Quit 


The quit statement stops execution of a BC program and returns control to UNIX when 
it is first encountered. Because it is not treated as an executable statement, it cannot be used 
in a function definition or in an if, for, or while statement. 


DC - An Interactive Desk Calculator


Robert Morris 


Lorinda Cherry 


Bell Laboratories 
Murray Hill, New Jersey 07974 


DC is an arbitrary precision arithmetic package implemented on the UNIX† time-sharing
system in the form of an interactive desk calculator. It works like a stacking calculator using 
reverse Polish notation. Ordinarily DC operates on decimal integers, but one may specify an 
input base, output base, and a number of fractional digits to be maintained. 


A language called BC [1] has been developed which accepts programs written in the fam- 
iliar style of higher-level programming languages and compiles output which is interpreted by 
DC. Some of the commands described below were designed for the compiler interface and are 
not easy for a human user to manipulate. 


Numbers that are typed into DC are put on a push-down stack. DC commands work by 
taking the top number or two off the stack, performing the desired operation, and pushing the 
result on the stack. If an argument is given, input is taken from that file until its end, then 
from the standard input. 
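

The stack discipline just described can be sketched in a few lines. The following Python fragment is a toy model, not DC: the function name rpn is hypothetical, only + - * on integers are handled, and tokens are assumed to be separated by blanks, which DC itself does not require.

```python
def rpn(text):
    """Evaluate a blank-separated string of integers and + - * operators."""
    stack = []
    for token in text.split():
        if token in ("+", "-", "*"):
            right = stack.pop()         # top of stack
            left = stack.pop()          # next to top
            if token == "+":
                stack.append(left + right)
            elif token == "-":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            # DC writes a negative number with a leading underscore
            stack.append(int(token.replace("_", "-")))
    return stack

print(rpn("2 3 + 4 *"))   # [20]
print(rpn("10 _3 -"))     # [13]
```

Each operator consumes its operands and leaves one result, so the stack after a well-formed expression holds just the answer.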


SYNOPTIC DESCRIPTION 


Here we describe the DC commands that are intended for use by people. The additional 
commands that are intended to be invoked by compiled output are described in the detailed 
description. 


Any number of commands are permitted on a line. Blanks and new-line characters are 
ignored except within numbers and in places where a register name is expected. 
The following constructions are recognized: 


number 


The value of the number is pushed onto the main stack. A number is an unbroken 
string of the digits 0-9 and the capital letters A-F which are treated as digits with
values 10-15 respectively. The number may be preceded by an underscore to input a 
negative number. Numbers may contain decimal points. 


+ - * / % ^
The top two values on the stack are added (+), subtracted (-), multiplied (*), divided
(/), remaindered (%), or exponentiated (^). The two entries are popped off the stack; the
result is pushed on the stack in their place. The result of a division is an integer trun- 
cated toward zero. See the detailed description below for the treatment of numbers with 
decimal points. An exponent must not have any digits after the decimal point. 


† UNIX is a trademark of Bell Laboratories.




sx
The top of the main stack is popped and stored into a register named x, where x may be
any character. If the s is capitalized, x is treated as a stack and the value is pushed onto
it. Any character, even blank or new-line, is a valid register name.


lx
The value in register x is pushed onto the stack. The register x is not altered. If the l is
capitalized, register x is treated as a stack and its top value is popped onto the main
stack.


All registers start with empty value which is treated as a zero by the command l and is treated
as an error by the command L.


d
The top value on the stack is duplicated.


p
The top value on the stack is printed. The top value remains unchanged.


f
All values on the stack and in registers are printed.


x
treats the top element of the stack as a character string, removes it from the stack, and
executes it as a string of DC commands.


[ ... ]
puts the bracketed character string onto the top of the stack.


q
exits the program. If executing a string, the recursion level is popped by two. If q is
capitalized, the top value on the stack is popped and the string execution level is popped
by that value.


<x >x =x !<x !>x !=x
The top two elements of the stack are popped and compared. Register x is executed if
they obey the stated relation. Exclamation point is negation.


v
replaces the top element on the stack by its square root. The square root of an integer is
truncated to an integer. For the treatment of numbers with decimal points, see the
detailed description below.


!
interprets the rest of the line as a UNIX command. Control returns to DC when the
UNIX command terminates.


c
All values on the stack are popped; the stack becomes empty.


i
The top value on the stack is popped and used as the number radix for further input. If
i is capitalized, the value of the input base is pushed onto the stack. No mechanism has
been provided for the input of arbitrary numbers in bases less than 1 or greater than 16.


o
The top value on the stack is popped and used as the number radix for further output.
If o is capitalized, the value of the output base is pushed onto the stack.


k
The top of the stack is popped, and that value is used as a scale factor that influences
the number of decimal places that are maintained during multiplication, division, and
exponentiation. The scale factor must be greater than or equal to zero and less than
100. If k is capitalized, the value of the scale factor is pushed onto the stack.


z
The value of the stack level is pushed onto the stack.


?
A line of input is taken from the input source (usually the console) and executed.


DETAILED DESCRIPTION


Internal Representation of Numbers 


Numbers are stored internally using a dynamic storage allocator. Numbers are kept in 
the form of a string of digits to the base 100 stored one digit per byte (centennial digits). The 
string is stored with the low-order digit at the beginning of the string. For example, the 
representation of 157 is 57,1. After any arithmetic operation on a number, care is taken that 
all digits are in the range 0-99 and that the number has no leading zeros. The number zero is 
represented by the empty string. 
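

The layout can be illustrated with a short conversion routine. The Python sketch below is a model of the representation only; the function names to_centennial and from_centennial are hypothetical, and a Python list stands in for the byte string that DC actually stores.

```python
def to_centennial(n):
    """Base-100 digits of a non-negative integer, low-order digit first."""
    digits = []
    while n > 0:
        n, digit = divmod(n, 100)
        digits.append(digit)
    return digits               # zero comes out as the empty list

def from_centennial(digits):
    """Inverse of to_centennial: digit i carries weight 100**i."""
    return sum(d * 100**i for i, d in enumerate(digits))

print(to_centennial(157))      # [57, 1]
print(to_centennial(0))        # []
print(to_centennial(120000))   # [0, 0, 12]
```

Storing the low-order digit first lets arithmetic proceed from the start of the string, in the same direction that carries travel.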


Negative numbers are represented in the 100’s complement notation, which is analogous 
to two’s complement notation for binary numbers. The high order digit of a negative number 
is always -1 and all other digits are in the range 0-99. The digit preceding the high order -1
digit is never a 99. The representation of -157 is 43,98,-1. We shall call this the canonical
form of a number. The advantage of this kind of representation of negative numbers is ease 
of addition. When addition is performed digit by digit, the result is formally correct. The 
result need only be modified, if necessary, to put it into canonical form. 
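

The canonical form and the claim about addition can be checked with a small model. In the Python sketch below the function names to_canonical, value and add are hypothetical, and add takes a shortcut that DC does not: after summing digit by digit it renormalises by way of the integer value instead of propagating carries along the string.

```python
def to_canonical(n):
    """Base-100 digits of n, low-order first; negatives end in the digit -1."""
    if n >= 0:
        digits = []
        while n > 0:
            n, d = divmod(n, 100)
            digits.append(d)
        return digits
    k = 0
    while 100**k < -n:          # smallest k with 100**k >= |n|
        k += 1
    rest = 100**k + n           # the complement, 0 <= rest < 100**k
    digits = []
    for _ in range(k):
        rest, d = divmod(rest, 100)
        digits.append(d)
    return digits + [-1]

def value(digits):
    """Digit i carries weight 100**i, whether or not the form is canonical."""
    return sum(d * 100**i for i, d in enumerate(digits))

def add(x, y):
    """Add digit by digit, then bring the result into canonical form."""
    width = max(len(x), len(y))
    x = x + [0] * (width - len(x))
    y = y + [0] * (width - len(y))
    raw = [p + q for p, q in zip(x, y)]   # formally correct, maybe not canonical
    return to_canonical(value(raw))

print(to_canonical(-157))                            # [43, 98, -1]
print(add(to_canonical(-157), to_canonical(200)))    # [43]
```

In the second call the raw digit sums are 43, 100, -1, which already have the right value (43) and need only be tidied into canonical form.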


Because the largest valid digit is 99 and the byte can hold numbers twice that large, 
addition can be carried out and the handling of carries done later when that is convenient, as 
it sometimes is. 


An additional byte is stored with each number beyond the high order digit to indicate 
the number of assumed decimal digits after the decimal point. The representation of .001 is
1,3 where the final 3 is the scale and not the high order digit.
The value of this extra byte is called the scale factor of the number. 


The Allocator 


DC uses a dynamic string storage allocator for all of its internal storage. All reading and 
writing of numbers internally is done through the allocator. Associated with each string in the 
allocator is a four-word header containing pointers to the beginning of the string, the end of 
the string, the next place to write, and the next place to read. Communication between the 
allocator and DC is done via pointers to these headers. 


The allocator initially has one large string on a list of free strings. All headers except
the one pointing to this string are on a list of free headers. Requests for strings are made by 
size. The size of the string actually supplied is the next higher power of 2. When a request 
for a string is made, the allocator first checks the free list to see if there is a string of the 
desired size. If none is found, the allocator finds the next larger free string and splits it 
repeatedly until it has a string of the right size. Left-over strings are put on the free list. If 
there are no larger strings, the allocator tries to coalesce smaller free strings into larger ones. 
Since all strings are the result of splitting large strings, each string has a neighbor that is next 
to it in core and, if free, can be combined with it to make a string twice as long. This is an 
implementation of the ‘buddy system’ of allocation described in [2]. 
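

The splitting and combining of neighbors can be sketched as follows. This Python class is a toy model under several assumptions, not the DC allocator: the class name Buddy and its methods are hypothetical, blocks are identified by offset and size instead of by headers, free blocks are combined eagerly on release (the text above describes combining only when a request cannot otherwise be met), and a failed request returns None where DC would ask the system for more space.

```python
class Buddy:
    """Toy buddy allocator over offsets 0 .. total-1 (total a power of 2)."""

    def __init__(self, total):
        self.free = {total: [0]}        # block size -> offsets of free blocks

    def alloc(self, size):
        want = 1
        while want < size:              # round the request up to a power of 2
            want *= 2
        have = want
        while not self.free.get(have):  # look for the next larger free block
            have *= 2
            if have > max(self.free, default=0):
                return None             # nothing large enough is free
        offset = self.free[have].pop()
        while have > want:              # split, leaving the upper halves free
            have //= 2
            self.free.setdefault(have, []).append(offset + have)
        return offset, want

    def release(self, offset, size):
        while True:
            buddy = offset ^ size       # the neighbor of the same size
            peers = self.free.setdefault(size, [])
            if buddy not in peers:
                peers.append(offset)
                return
            peers.remove(buddy)         # combine the pair into one larger block
            offset = min(offset, buddy)
            size *= 2

pool = Buddy(64)
first = pool.alloc(10)      # (0, 16): a request for 10 gets a block of 16
second = pool.alloc(3)      # (16, 4)
pool.release(*second)
pool.release(*first)        # the pool is one free block of 64 again
```

Because every block size is a power of 2 and blocks arise only by halving, the neighbor of a block is found by flipping one bit of its offset.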


Failing to find a string of the proper length after coalescing, the allocator asks the sys- 
tem for more space. The amount of space on the system is the only limitation on the size and 
number of strings in DC. If at any time in the process of trying to allocate a string, the allo- 
cator runs out of headers, it also asks the system for more space. 


There are routines in the allocator for reading, writing, copying, rewinding, forward- 
spacing, and backspacing strings. All string manipulation is done using these routines. 


The reading and writing routines increment the read pointer or write pointer so that the 
characters of a string are read or written in succession by a series of read or write calls. The 
write pointer is interpreted as the end of the information-containing portion of a string and a 
call to read beyond that point returns an end-of-string indication. An attempt to write 
beyond the end of a string causes the allocator to allocate a larger space and then copy the old 
string into the larger block. 


Internal Arithmetic 


All arithmetic operations are done on integers. The operands (or operand) needed for 
the operation are popped from the main stack and their scale factors stripped off. Zeros are 
added or digits removed as necessary to get a properly scaled result from the internal arith- 
metic routine. For example, if the scale of the operands is different and decimal alignment is 
required, as it is for addition, zeros are appended to the operand with the smaller scale. After 
performing the required arithmetic operation, the proper scale factor is appended to the end 
of the number before it is pushed on the stack.


A register called scale plays a part in the results of most arithmetic operations. scale 
is the bound on the number of decimal places retained in arithmetic computations. scale 
may be set to the number on the top of the stack truncated to an integer with the k com- 
mand. K may be used to push the value of scale on the stack. scale must be greater than 
or equal to 0 and less than 100. The descriptions of the individual arithmetic operations will 
include the exact effect of scale on the computations. 


Addition and Subtraction 


The scales of the two numbers are compared and trailing zeros are supplied to the 
number with the lower scale to give both numbers the same scale. The number with the 
smaller scale is multiplied by 10 if the difference of the scales is odd. The scale of the result 
is then set to the larger of the scales of the two operands. 


Subtraction is performed by negating the number to be subtracted and proceeding as in 
addition. 


Finally, the addition is performed digit by digit from the low order end of the number. 
The carries are propagated in the usual way. The resulting number is brought into canonical 
form, which may require stripping of leading zeros, or for negative numbers replacing the 
high-order configuration 99,-1 by the digit -1. In any case, digits which are not in the range
0-99 must be brought into that range, propagating any carries or borrows that result. 
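

Leaving the base-100 digit strings aside, the alignment of scales amounts to the following. In this Python sketch a number is assumed to be a pair (integer, scale) whose value is integer / 10**scale; the function names add_scaled and sub_scaled are hypothetical, and Python integers take the place of DC's digit-by-digit addition.

```python
def add_scaled(x, y):
    """Add two (integer, scale) pairs, where the value is integer / 10**scale."""
    (xi, xs), (yi, ys) = x, y
    scale = max(xs, ys)                 # the result takes the larger scale
    xi *= 10 ** (scale - xs)            # supply trailing zeros to the shorter one
    yi *= 10 ** (scale - ys)
    return xi + yi, scale

def sub_scaled(x, y):
    """Subtract by negating the second operand and adding."""
    yi, ys = y
    return add_scaled(x, (-yi, ys))

print(add_scaled((15, 1), (225, 2)))   # (375, 2), that is 1.5 + 2.25 = 3.75
print(sub_scaled((15, 1), (225, 2)))   # (-75, 2), that is 1.5 - 2.25 = -.75
```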


Multiplication


The scales are removed from the two operands and saved. The operands are both made 
positive. Then multiplication is performed in a digit by digit manner that exactly mimics the 
hand method of multiplying. The first number is multiplied by each digit of the second
number, beginning with its low order digit. The intermediate products are accumulated into a 
partial sum which becomes the final product. The product is put into the canonical form and 
its sign is computed from the signs of the original operands. 


The scale of the result is set equal to the sum of the scales of the two operands. If that 
scale is larger than the internal register scale and also larger than both of the scales of the 
two operands, then the scale of the result is set equal to the largest of these three last quanti- 
ties. 


Division


The scales are removed from the two operands. Zeros are appended or digits removed
from the dividend to make the scale of the result of the integer division equal to the internal
quantity scale. The signs are removed and saved.


Division is performed much as it would be done by hand. The difference of the lengths 
of the two numbers is computed. If the divisor is longer than the dividend, zero is returned. 
Otherwise the top digit of the divisor is divided into the top two digits of the dividend. The 
result is used as the first (high-order) digit of the quotient. It may turn out to be one unit too
low, but if it is, the next trial quotient will be larger than 99 and this will be adjusted at the 
end of the process. The trial digit is multiplied by the divisor and the result subtracted from 
the dividend and the process is repeated to get additional quotient digits until the remaining 
dividend is smaller than the divisor. At the end, the digits of the quotient are put into the 
canonical form, with propagation of carry as needed. The sign is set from the sign of the 
operands. 


Remainder 


The division routine is called and division is performed exactly as described. The quan- 
tity returned is the remains of the dividend at the end of the divide process. Since division 
truncates toward zero, remainders have the same sign as the dividend. The scale of the 
remainder is set to the maximum of the scale of the dividend and the scale of the quotient 
plus the scale of the divisor. 
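
The relation between quotient, remainder, and their scales can be sketched with scaled integers. This is an illustration, not DC's code: a number is assumed to be a pair (integer value, scale), so that (233, 2) stands for 2.33, and the names trunc_div and divide_with_remainder are hypothetical.

```python
def trunc_div(n, d):
    """Integer division that truncates toward zero."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def divide_with_remainder(dividend, divisor, register_scale):
    """Return (quotient, remainder), each as an (integer value, scale) pair."""
    (a, sa), (b, sb) = dividend, divisor
    # Append zeros to, or remove digits from, the dividend so that the
    # integer quotient carries exactly register_scale decimal places.
    shift = register_scale + sb - sa
    adjusted = a * 10**shift if shift >= 0 else trunc_div(a, 10**-shift)
    quotient = trunc_div(adjusted, b)
    # Remainder scale: the larger of the dividend's scale and the
    # quotient's scale plus the divisor's scale.
    rem_scale = max(sa, register_scale + sb)
    remainder = (a * 10**(rem_scale - sa)
                 - quotient * b * 10**(rem_scale - register_scale - sb))
    return (quotient, register_scale), (remainder, rem_scale)
```

With the internal scale set to 2, dividing 7 by 3 gives the quotient 2.33 and the remainder .01, and 2.33 times 3 plus .01 recreates 7.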


Square Root 


The scale is stripped from the operand. Zeros are added if necessary to make the integer 
result have a scale that is the larger of the internal quantity scale and the scale of the 
operand. 

The method used to compute sqrt(y) is Newton’s method with successive approximations 
by the rule 

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{y}{x_n}\right)$$



The initial guess is found by taking the integer square root of the top two digits. 
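
Newton's iteration on integers can be sketched as follows. Two points are assumptions of this sketch rather than a description of DC: the starting guess is a power of two chosen to lie at or above the true root (DC starts from the integer square root of the top two digits), and numbers are (integer value, scale) pairs. The names newton_isqrt and sqrt_scaled are invented here.

```python
def newton_isqrt(y):
    """Largest integer x with x*x <= y, found by Newton's method."""
    if y < 0:
        raise ValueError("square root of a negative number")
    if y == 0:
        return 0
    x = 1 << ((y.bit_length() + 1) // 2)   # assumed start, never below the root
    while True:
        nxt = (x + y // x) // 2            # x_{n+1} = (x_n + y/x_n) / 2
        if nxt >= x:                       # no further decrease: x is the root
            return x
        x = nxt


def sqrt_scaled(operand, register_scale):
    """Square root of (value, scale); the result scale is the larger of the two scales."""
    value, scale = operand
    result_scale = max(register_scale, scale)
    # Add zeros so that the integer root carries result_scale decimal places.
    padded = value * 10 ** (2 * result_scale - scale)
    return newton_isqrt(padded), result_scale
```

For instance, sqrt_scaled((2, 0), 4) pads 2 to 200000000 and returns (14142, 4), that is, 1.4142.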


Exponentiation 


Only exponents with zero scale factor are handled. If the exponent is zero, then the 
result is 1. If the exponent is negative, then it is made positive and the base is divided into 
one. The scale of the base is removed. 

The integer exponent is viewed as a binary number. The base is repeatedly squared and 
the result is obtained as a product of those powers of the base that correspond to the positions 
of the one-bits in the binary representation of the exponent. Enough digits of the result are 
removed to make the scale of the result the same as if the indicated multiplication had been 
performed. 
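
The repeated-squaring scheme can be sketched for integers as follows. The function name power is an assumption, and the sketch leaves out scale handling and negative exponents.

```python
def power(base, exponent):
    """Raise base to a non-negative integer exponent by repeated squaring."""
    if exponent < 0:
        raise ValueError("this sketch handles only non-negative exponents")
    result = 1
    square = base                # base**1, then base**2, base**4, ...
    while exponent > 0:
        if exponent & 1:         # a one-bit: this power of the base is a factor
            result *= square
        square *= square
        exponent >>= 1
    return result
```

An exponent of 13 is 1101 in binary, so the result is the product of the first, fourth, and eighth powers of the base.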


Input Conversion and Base 


Numbers are converted to the internal representation as they are read in. The scale 
stored with a number is simply the number of fractional digits input. Negative numbers are 
indicated by preceding the number with a _. The hexadecimal digits A-F correspond to the 
numbers 10-15 regardless of input base. The i command can be used to change the base of 
the input numbers. This command pops the stack, truncates the resulting number to an 
integer, and uses it as the input base for all further input. The input base is initialized to 10 
but may, for example, be changed to 8 or 16 to do octal or hexadecimal to decimal conversions. 
The command I will push the value of the input base on the stack. 
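
Input conversion for whole numbers can be sketched as below. This is an illustration under assumptions: the function name parse_number is invented, fractional digits are not handled, and a leading underscore is taken as the negative sign.

```python
DIGITS = "0123456789ABCDEF"


def parse_number(text, input_base=10):
    """Convert a string of digits to an integer in the given input base."""
    negative = text.startswith("_")
    body = text[1:] if negative else text
    value = 0
    for ch in body:
        # A-F always stand for 10-15, whatever the input base is.
        value = value * input_base + DIGITS.index(ch)
    return -value if negative else value
```

With the input base set to 8, parse_number("17", 8) gives 15; with it set to 16, parse_number("FF", 16) gives 255.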


Output Commands 


The command p causes the top of the stack to be printed. It does not remove the top of 
the stack. All of the stack and internal registers can be output by typing the command f. 
The o command can be used to change the output base. This command uses the top of the 
stack, truncated to an integer, as the base for all further output. The output base is initialized 
to 10. It will work correctly for any base. The command O pushes the value of the output 
base on the stack. 


Output Format and Base 


The input and output bases only affect the interpretation of numbers on input and out- 
put; they have no effect on arithmetic computations. Large numbers are output with 70 char- 
acters per line; a \ indicates a continued line. All choices of input and output bases work 
correctly, although not all are useful. A particularly useful output base is 100000, which has 
the effect of grouping digits in fives. Bases of 8 and 16 can be used for decimal-octal or 
decimal-hexadecimal conversions. 
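
The effect of the output base can be illustrated by splitting a number into its digits in that base. The function name to_base_digits is an assumption, and the sketch returns the digit values rather than reproducing DC's exact print format.

```python
def to_base_digits(n, output_base):
    """Digits of a non-negative integer in output_base, high-order first."""
    if n < 0 or output_base < 2:
        raise ValueError("sketch handles n >= 0 and bases of 2 or more")
    digits = []
    while True:
        n, d = divmod(n, output_base)
        digits.append(d)
        if n == 0:
            break
    digits.reverse()
    return digits
```

With an output base of 100000, each "digit" is a group of five decimal digits: to_base_digits(1234567890, 100000) yields [12345, 67890].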


Internal Registers 


Numbers or strings may be stored in internal registers or loaded on the stack from regis- 
ters with the commands s and l. The command sx pops the top of the stack and stores the 
result in register x. x can be any character. lx puts the contents of register x on the top of 
the stack. The l command has no effect on the contents of register x. The s command, how- 
ever, is destructive. 


Stack Commands 


The command c clears the stack. The command d pushes a duplicate of the number on 
the top of the stack on the stack. The command z pushes the stack size on the stack. The 
command X replaces the number on the top of the stack with its scale factor. The command 
Z replaces the top of the stack with its length. 


Subroutine Definitions and Calls 


Enclosing a string in [] pushes the ASCII string on the stack. The q command quits or, 
when executing a string, pops the recursion level by two. 


Internal Registers — Programming DC 


The load and store commands together with [] to store strings, x to execute and the 
testing commands ‘<’, ‘>’, ‘=’, ‘!<’, ‘!>’, ‘!=’ can be used to program DC. The x command 
assumes the top of the stack is a string of DC commands and executes it. The testing com- 
mands compare the top two elements on the stack and if the relation holds, execute the regis- 
ter that follows the relation. For example, to print the numbers 0-9, 

[lip1+ si li10>a]sa 
0si lax 
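
For readers who find the stack notation hard to follow, the same loop can be traced in Python. This is only a reading aid: the function name run_counter is invented, the printed values are collected in a list instead of being written out, and the comparison is assumed to test whether the top of the stack (10) is greater than the element below it (the counter).

```python
def run_counter(limit=10):
    """Trace of the DC program  [lip1+ si li10>a]sa  0si lax."""
    printed = []
    i = 0                      # 0si : store 0 in register i
    while True:                # lax : load and execute the string in register a
        printed.append(i)      # li p : load i and print it
        i = i + 1              # 1+ si : add 1 and store the sum back in i
        if not (limit > i):    # li 10 >a : execute a again only while 10 > i
            break
    return printed
```

Calling run_counter() returns the numbers 0 through 9, matching what the DC program prints.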


Push-Down Registers and Arrays 


These commands were designed for use by a compiler, not by people. They involve 
push-down registers and arrays. In addition to the stack that commands work on, DC can be 
thought of as having individual stacks for each register. These registers are operated on by 
the commands S and L. Sx pushes the top value of the main stack onto the stack for the 
register x. Lx pops the stack for register x and puts the result on the main stack. The com- 
mands s and l also work on registers but not as push-down stacks. l doesn’t affect the top of 
the register stack, and s destroys what was there before. 


The commands to work on arrays are : and ;. :x pops the stack and uses this value as an 
index into the array x. The next element on the stack is stored at this index in x. An index 
must be greater than or equal to 0 and less than 2048. ;x is the command to load the main 
stack from the array x. The value on the top of the stack is the index into the array x of the 
value to be loaded. 
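
One way to picture these commands is as a small Python class. This is a hypothetical model, not DC's implementation: the class and method names are invented, each register is assumed to be a Python list used as a stack, and each array a dictionary keyed by index.

```python
class Registers:
    """Toy model of DC's registers, register stacks, and arrays."""

    MAX_INDEX = 2048  # an index must satisfy 0 <= index < 2048

    def __init__(self):
        self.stacks = {}   # register name -> list used as a push-down stack
        self.arrays = {}   # array name -> {index: value}

    def store(self, name, value):        # sx : overwrite the top, destroying it
        stack = self.stacks.setdefault(name, [])
        if stack:
            stack[-1] = value
        else:
            stack.append(value)

    def load(self, name):                # lx : copy the top without removing it
        return self.stacks[name][-1]

    def push(self, name, value):         # Sx : push onto the register's stack
        self.stacks.setdefault(name, []).append(value)

    def pop(self, name):                 # Lx : pop the register's stack
        return self.stacks[name].pop()

    def array_store(self, name, index, value):   # :x
        self._check(index)
        self.arrays.setdefault(name, {})[index] = value

    def array_load(self, name, index):           # ;x
        self._check(index)
        return self.arrays[name][index]

    def _check(self, index):
        if not 0 <= index < self.MAX_INDEX:
            raise IndexError("array index out of range")
```

In this model, push followed by store replaces only the newest value, so a later pop uncovers whatever was pushed before it.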


Miscellaneous Commands 


The command ! interprets the rest of the line as a UNIX 
command and passes it to UNIX to execute. One other compiler command is Q. This com- 
mand uses the top of the stack as the number of levels of recursion to skip. 


DESIGN CHOICES 


The real reason for the use of a dynamic storage allocator was that a general purpose 
program could be (and in fact has been) used for a variety of other tasks. The allocator has 
some value for input and for compiling (i.e. the bracket [...] commands) where it cannot be 
known in advance how long a string will be. The result was that at a modest cost in execution 
time, all considerations of string allocation and sizes of strings were removed from the 
remainder of the program and debugging was made easier. The allocation method used wastes 
approximately 25% of available space. 


The choice of 100 as a base for internal arithmetic seemingly has no compelling advan- 
tage. Yet the base cannot exceed 127 because of hardware limitations and at the cost of 5% 
in space, debugging was made a great deal easier and decimal output was made much faster. 


The reason for a stack-type arithmetic design was to permit all DC commands from 
addition to subroutine execution to be implemented in essentially the same way. The result 
was a considerable degree of logical separation of the final program into modules with very lit- 
tle communication between modules. 


The rationale for the lack of interaction between the scale and the bases was to provide 
an understandable means of proceeding after a change of base or scale when numbers had 
already been entered. An earlier implementation which had global notions of scale and base 
did not work out well. If the value of scale were to be interpreted in the current input or 
output base, then a change of base or scale in the midst of a computation would cause great 
confusion in the interpretation of the results. The current scheme has the advantage that the 
value of the input and output bases are only used for input and output, respectively, and they 
are ignored in all other operations. The value of scale is not used for any essential purpose by 
any part of the program and it is used only to prevent the number of decimal places resulting 
from the arithmetic operations from growing beyond all bounds. 


The design rationale for the choices for the scales of the results of arithmetic was that 
in no case should any significant digits be thrown away if, on appearances, the user actually 
wanted them. Thus, if the user wants to add the numbers 1.5 and 3.517, it seemed reasonable 
to give him the result 5.017 without requiring him to unnecessarily specify his rather obvious 
requirements for precision. 
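
The addition example can be checked with scaled integers. In this sketch a number is assumed to be a pair (integer value, scale), so 1.5 is (15, 1) and 3.517 is (3517, 3); the function name add_scaled is invented for illustration.

```python
def add_scaled(x, y):
    """Add two (value, scale) numbers, keeping the larger scale."""
    (a, scale_a), (b, scale_b) = x, y
    scale = max(scale_a, scale_b)
    # Pad the operand with fewer decimal places so the digits line up.
    total = a * 10 ** (scale - scale_a) + b * 10 ** (scale - scale_b)
    return total, scale
```

Here add_scaled((15, 1), (3517, 3)) gives (5017, 3), that is, 5.017: no digit of either operand is thrown away.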


On the other hand, multiplication and exponentiation produce results with many more 
digits than their operands and it seemed reasonable to give as a minimum the number of 
decimal places in the operands but not to give more than that number of digits unless the user 
asked for them by specifying a value for scale. Square root can be handled in just the same 
way as multiplication. The operation of division gives arbitrarily many decimal places and 
there is simply no way to guess how many places the user wants. In this case only, the user 
must specify a scale to get any decimal places at all. 


The scale of remainder was chosen to make it possible to recreate the dividend from the 
quotient and remainder. This is easy to implement; no digits are thrown away. 
PART 3: TEXT EDITORS 


ULTRIX-32 offers five editors that you can use to create new files and modify existing files. 
Two of the six articles in this part describe the editor, ed. The remaining four articles 
describe edit, vi, ex, and sed. This introduction will help you compare the merits and features 
of the different editors and select an appropriate article. 


Editor    Type of Editor    Article 

edit      Line              Edit: A Tutorial 

ed        Line              A Tutorial Introduction to the UNIX Text Editor 
                            Advanced Editing on UNIX 

vi        Screen            An Introduction to Display Editing with Vi 

ex        Line              Ex Reference Manual 

sed       Stream            Sed - A Non-interactive Text Editor 


Edit and ed were developed for use on hard-copy terminals and video terminals connected to 
phone links slower than 1200 baud. If you have access to a video terminal on a medium or 
high-speed line (1200 baud or faster), vi is more appropriate. Ex is a general purpose line edi- 
tor (often the editor of choice), and sed is suitable for sophisticated users concerned with 
batch editing. 


edit 

“Edit: A Tutorial” introduces the edit editor at a basic level. This editor is suitable for peo- 
ple new to the ULTRIX-32 system. Tutorials for four complete editing sessions make up the 
article on edit. These sessions advance from simple tasks to searching, substitution, and file 
recovery. 


ed 


“A Tutorial Introduction to the UNIX Text Editor” demonstrates the basic commands in ed. 
This editor is easy to use, but error messages provided with ed are not as helpful as error mes- 
sages for the other editors. The article includes examples and abundant explanations. 
“Advanced Editing on UNIX” covers those features of ed not explained in the first article, 
including using metacharacters, cutting and pasting, and making global changes. 
vi 

Vi is the ULTRIX-32 system screen editor, and “An Introduction to Display Editing with Vi” 
offers a complete description. Vi is more efficient and easier to use than ed and edit, because 
it shows you as many as 24 lines of text at once. The screen display provides a context for the 
line you are entering or changing. You can move the cursor around on the screen with arrows 
and with address commands. The command set available to you in vi is large and flexible, 
and a set of options allows you to tailor the editor to suit your needs. The article on vi is 
appropriate for beginners as well as expert ULTRIX-32 system users; it progresses from sim- 
ple cursor positioning functions to sophisticated buffer filtering and macro facilities. 


ex 


Ex is a line editor, like edit and ed. However, ex offers a very large set of commands, options, 
and modes. In fact, edit and vi are modes (subsets) of ex. Ex is appropriate for novices as 
well as experienced users. However, the description of ex included here in the “Ex Reference 
Manual” is not a tutorial; it presents the rules that govern use of the editor and lists the com- 
mands and options alphabetically. Since edit is similar to but simpler than ex, you should 
find it helpful to read the article on edit first. The power and flexibility of ex make it the 
best editor for many applications. 


sed 


Sed, the stream editor, is an ULTRIX-32 system filter instead of an interactive editor. Sed 
can take its input either from the command line or from a script file (a file containing sed 
commands to be applied to the text file to be edited). It is most appropriate when used for 
editing functions that are repeated frequently as steps in a longer process, such as converting 
a list of users into a distribution list. The article “Sed - A Non-interactive Text Editor” pro- 
vides a reference with explanations and examples of sed commands. If you already know ed, 
you have a head start on learning sed, since sed commands resemble ed commands. How- 
ever, the interactive editors are easier to use and more practical in most cases than sed. 


Summary 


Most users choose vi to create and modify files. Ex, edit, and ed are good on slow phone lines 
and hard-copy terminals. Sed is best for experienced users with batch editing requirements. 
Introduction 


Text editing using a terminal connected to a computer allows you to create, modify, and 
print text easily. A text editor is a program that assists you as you create and modify text. 
The text editor you will learn here is named edit. Creating text using edit is as easy as typing 
it on an electric typewriter. Modifying text involves telling the text editor what you want to 
add, change, or delete. You can review your text by typing a command to print the file con- 
tents as they were entered by you. Another program, a text formatter, rearranges your text 
for you into “finished form.” This document does not discuss the use of a text formatter. 


These lessons assume no prior familiarity with computers or with text editing. They 
consist of a series of text editing sessions which lead you through the fundamental steps of 
creating and revising text. After scanning each lesson and before beginning the next, you 
should practice the examples at a terminal to get a feeling for the actual process of text edit- 
ing. If you set aside some time for experimentation, you will soon become familiar with using 
the computer to write and modify text. In addition to the actual use of the text editor, other 
features of UNIX will be very important to your work. You can begin to learn about these 
other features by reading “Communicating with UNIX” or one of the other tutorials that pro- 
vide a general introduction to the system. You will be ready to proceed with this lesson as 
soon as you are familiar with (1) your terminal and its special keys, (2) the login procedure, 
and (3) the ways of correcting typing errors. Let’s first define some terms: 


program A set of instructions, given to the computer, describing the sequence of steps the 
computer performs in order to accomplish a specific task. The tasks must be 
specific, such as balancing your checkbook or editing your text. A general task, 
such as working for world peace, is something we can do, but not something we 
can write programs to do. 


UNIX UNIX is a special type of program, called an operating system, that supervises the 
machinery and all other programs comprising the total computer system. 


edit edit is the name of the UNIX text editor you will be learning to use, and is a pro- 
gram that aids you in writing or revising text. Edit was designed for beginning 
users, and is a simplified version of an editor named ex. 


file Each UNIX account is allotted space for the permanent storage of information, 
such as programs, data or text. A file is a logical unit of data, for example, an 
essay, a program, or a chapter from a book, which is stored on a computer sys- 
tem. Once you create a file, it is kept until you instruct the system to remove it. 
You may create a file during one UNIX session, end the session, and return to use 
it at a later time. Files contain anything you choose to write and store in them. 
The sizes of files vary to suit your needs; one file might hold only a single 
number, yet another might contain a very long document or program. The only 
way to save information from one session to the next is to store it in a file, which 
you will learn in Session 1. 
filename Filenames are used to distinguish one file from another, serving the same purpose 
as the labels of manila folders in a file cabinet. In order to write or access infor- 
mation in a file, you use the name of that file in a UNIX command, and the sys- 
tem will automatically locate the file. 


disk Files are stored on an input/output device called a disk, which looks something 
like a stack of phonograph records. Each surface is coated with a material simi- 
lar to the coating on magnetic recording tape, and information is recorded on it. 


buffer A temporary work space, made available to the user for the duration of a session 
of text editing and used for creating and modifying the text file. We can think of 
the buffer as a blackboard that is erased after each class, where each session with 
the editor is a class. 
Session 1 


Making contact with UNIX 


To use the editor you must first make contact with the computer by logging in to UNIX. 
We'll quickly review the standard UNIX login procedure for the two ways you can make con- 
tact: on a terminal that is directly linked to the computer, or over a telephone line where the 
computer answers your call. 


Directly-linked terminals 
Turn on your terminal and press the RETURN key. You are now ready to login. 


Dial-up terminals 


If your terminal connects with the computer over a telephone line, turn on the terminal, 
dial the system access number, and, when you hear a high-pitched tone, place the receiver of 
the telephone in the acoustic coupler. You are now ready to login. 


Logging in 
The message inviting you to login is: 
:login: 


Type your login name, which identifies you to UNIX, on the same line as the login message, 
and press RETURN. If the terminal you are using has both upper and lower case, be sure 
you enter your login name in lower case; otherwise UNIX assumes your terminal has 
only upper case and will not recognize lower case letters you may type. UNIX types “:login:” 
and you reply with your login name, for example “susan”: 


:login: susan (and press the RETURN key) 


(In the examples, input you would type appears in bold face to distinguish it from the 
responses from UNIX.) 


UNIX will next respond with a request for a password as an additional precaution to 
prevent unauthorized people from using your account. The password will not appear when 
you type it, to prevent others from seeing it. The message is: 


Password: (type your password and press RETURN) 


If any of the information you gave during the login sequence was mistyped or incorrect, UNIX 
will respond with 


Login incorrect. 
:login: 


in which case you should start the login process anew. Assuming that you have successfully 
logged in, UNIX will print the message of the day and eventually will present you with a % at 
the beginning of a fresh line. The % is the UNIX prompt symbol which tells you that UNIX is 
ready to accept a command. 


Asking for edit 


You are ready to tell UNIX that you want to work with edit, the text editor. Now is a 
convenient time to choose a name for the file of text you are about to create. To begin your 
editing session, type edit followed by a space and then the filename you have selected; for 
example, “text”. When you have completed the command, press the RETURN key and wait for 
edit’s response: 
% edit text (followed by a RETURN) 
"text" No such file or directory 


If you typed the command correctly, you will now be in communication with edit. Edit has 
set aside a buffer for use as a temporary working space during your current editing session. It 
also checked to see if the file you named, “text”, already existed. It was unable to find such a 
file, since “text” is a new file we are about to create. Edit confirms this with the line: 


"text" No such file or directory 


On the next line appears edit’s prompt “:”, announcing that you are in command mode and 
edit expects a command from you. You may now begin to create the new file. 


The “Command not found” message 
If you misspelled edit by typing, say, “editor”, your request would be handled as follows: 


% editor 
editor: Command not found 
% 


Your mistake in calling edit “editor” was treated by UNIX as a request for a program named 
“editor”. Since there is no program named “editor”, UNIX reported that the program was “not 
found”. A new % indicates that UNIX is ready for another command, and you may then enter 
the correct command. 


A summary 


Your exchange with UNIX as you logged in and made contact with edit should look some- 
thing like this: 


:login: susan 
Password: 
... A Message of General Interest ... 
% edit text 
"text" No such file or directory 


Entering text 


You may now begin entering text into the buffer. This is done by appending (or adding) 
text to whatever is currently in the buffer. Since there is nothing in the buffer at the moment, 
you are appending text to nothing; in effect, since you are adding text to nothing you are 
creating text. Most edit commands have two forms: a word that suggests what the command 
does, and a shorter abbreviation of that word. Either form may be used. Many beginners find 
the full command names easier to remember at first, but once you are familiar with editing 
you may prefer to type the shorter abbreviations. The command to input text is “append”, 
and it may be abbreviated “a”. Type append and press the RETURN key. 


% edit text 
:append 


Messages from edit 


If you make a mistake in entering a command and type something that edit does not 
recognize, edit will respond with a message intended to help you diagnose your error. For 
example, if you misspell the command to input text by typing, perhaps, “add” instead of 
“append” or “a”, you will receive this message: 
:add 
add: Not an editor command 


When you receive a diagnostic message, check what you typed in order to determine what part 
of your command confused edit. The message above means that edit was unable to recognize 
your mistyped command and, therefore, did not execute it. Instead, a new “:” appeared to let 
you know that edit is again ready to execute a command. 


Text input mode 


By giving the command “append” (or using the abbreviation “a”), you entered text 
input mode, also known as append mode. When you enter text input mode, edit stops send- 
ing you a prompt. You will not receive any prompts or error messages while in text input 
mode. You can enter pretty much anything you want on the lines. The lines are transmitted 
one by one to the buffer and held there during the editing session. You may append as much 
text as you want, and when you wish to stop entering text lines you should type a period as 
the only character on the line and press the RETURN key. When you type the period and 
press RETURN, you signal that you want to stop appending text, and edit responds by allowing 
you to exit text input mode and reenter command mode. Edit will again prompt you for a 
command by printing “:”. 


Leaving append mode does not destroy the text in the buffer. You have to leave append 
mode to do any of the other kinds of editing, such as changing, adding, or printing text. If 
you type a period as the first character and type any other character on the same line, edit 
will believe you want to remain in append mode and will not let you out. As this can be very 
frustrating, be sure to type only the period and the RETURN key. 


This is a good place to learn an important lesson about computers and text: a blank 
space is a character as far as a computer is concerned. If you so much as type a period fol- 
lowed by a blank (that is, type a period and then the space bar on the keyboard), you will 
remain in append mode with the last line of text being: 


. 


Let’s say that the lines of text you enter are (try to type exactly what you see, including 
“thiss”): 


This is some sample text. 
And thiss is some more text. 
Text editing is strange, but nice. 
. 


The last line is the period followed by a RETURN that gets you out of append mode. 
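
The rule for leaving append mode can be modelled in a few lines of Python. This is a sketch of the behaviour described above, not edit's actual code; the function name read_append_input is invented for illustration.

```python
def read_append_input(typed_lines):
    """Collect lines into the buffer until a line that is exactly a period."""
    buffer = []
    for line in typed_lines:
        if line == ".":        # only a lone period ends append mode
            break
        buffer.append(line)    # anything else, even ". ", is kept as text
    return buffer
```

Given the three sample lines followed by a period, the buffer holds three lines. If a period followed by a blank is typed instead, it is stored as a fourth line of text and append mode continues.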


Making corrections 


If you have read a general introduction to UNIX, such as “Communicating with UNIX”, 
you will recall that it is possible to erase individual letters that you have typed. This is done 
by typing the designated erase character as many times as there are characters you want to 
erase. 


The usual erase character is the backspace (control-H), and you can correct typing errors 
in the line you are typing by holding down the CTRL key and typing the “H” key. If you try 
typing control-H you will notice that the terminal backspaces in the line you are on. You can 
backspace over your error, and then type what you want to be the rest of the line. 


If you make a bad start in a line and would like to begin again, you can either backspace 
to the beginning of the line or you can use the at-sign “@” to erase everything on the line: 
Text edtiing is strange, but @ 
Text editing is strange, but nice. 


When you type the at-sign (@), you erase the entire line typed so far and are given a fresh 
line to type on. You may immediately begin to retype the line. This, unfortunately, does not 
help after you type the line and press RETURN. To make corrections in lines that have been 
completed, it is necessary to use the editing commands covered in the next session and those 
that follow. 


Writing text to disk 


You are now ready to edit the text. The simplest kind of editing is to write it to disk as 
a file for safekeeping after the session is over. This is the only way to save information from 
one session to the next, since the editor’s buffer is temporary and will last only until the end 
of the editing session. Learning how to write a file to disk is second in importance only to 
entering the text. To write the contents of the buffer to a disk file, use the command “write” 
(or its abbreviation “w”): 


:write 


Edit will copy the contents of the buffer to a disk file. If the file does not yet exist, a new file 
will be created automatically and the presence of a “[New file]” will be noted. The newly- 
created file will be given the name specified when you entered the editor, in this case “text”. 
To confirm that the disk file has been successfully written, edit will repeat the filename and 
give the number of lines and the total number of characters in the file. The buffer remains 
unchanged by the “write” command. All of the lines that were written to disk will still be in 
the buffer, should you want to modify or add to them. 
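
The counts that edit reports can be reproduced with a short Python sketch. It assumes that each line is stored with one end-of-line character, which is counted along with the visible characters; the function name write_report is invented, and the message layout simply imitates the one shown in this tutorial.

```python
def write_report(filename, buffer_lines, new_file=False):
    """Build a confirmation message like the one edit prints after a write."""
    # Each line contributes its visible characters plus one end-of-line character.
    characters = sum(len(line) + 1 for line in buffer_lines)
    tag = " [New file]" if new_file else ""
    return f'"{filename}"{tag} {len(buffer_lines)} lines, {characters} characters'


sample = [
    "This is some sample text.",
    "And thiss is some more text.",
    "Text editing is strange, but nice.",
]
print(write_report("text", sample, new_file=True))
```

For the three sample lines entered in this session, the lines contain 25, 28, and 34 characters; adding one end-of-line character to each gives a total of 90.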


Edit must have a filename to use before it can write a file. If you forgot to indicate the 
name of the file when you began the editing session, edit will print 


No current filename 


in response to your write command. If this happens, you can specify the filename in a new 
write command: 


:write text 


After the “write” (or “w”), type a space and then the name of the file. 


Signing off 


We have done enough for this first lesson on using the UNIX text editor, and are ready to 
quit the session with edit. To do this we type “quit” (or “q”) and press RETURN: 


:write 
"text" [New file] 3 lines, 90 characters 
:quit 
% 
The % is from UNIX to tell you that your session with edit is over and you may command 
UNIX further. Since we want to end the entire session at the terminal, we also need to exit 
from UNIX. In response to the UNIX prompt of “% ” type the command 


% logout 


This will end your session with UNIX, and will ready the terminal for the next user. It is 
always important to type logout at the end of a session to make absolutely sure no one could 
accidentally stumble into your abandoned session and thus gain access to your files, tempting 
even the most honest of souls. 


This is the end of the first session on UNIX text editing. 
Session 2 


Login with UNIX as in the first session: 


:login: susan (carriage return) 
Password: (give password and carriage return) 


... A Message of General Interest ... 
% 


When you indicate you want to edit, you can specify the name of the file you worked on last 
time. This will start edit working, and it will fetch the contents of the file into the buffer, so 
that you can resume editing the same file. When edit has copied the file into the buffer, it 
will repeat its name and tell you the number of lines and characters it contains. Thus, 


% edit text 
"text" 3 lines, 90 characters 


means you asked edit to fetch the file named “text” for editing, causing it to copy the 90 char- 
acters of text into the buffer. Edit awaits your further instructions, and indicates this by its 
prompt character, the colon (:). In this session, we will append more text to our file, print the 
contents of the buffer, and learn to change the text of a line. 


Adding more text to the file 


If you want to add more to the end of your text you may do so by using the append 
command to enter text input mode. When “append” is the first command of your editing ses- 
sion, the lines you enter are placed at the end of the buffer. Here we’ll use the abbreviation 
for the append command, “a”: 


:a 
This is text added in Session 2. 
It doesn’t mean much here, but 
it does illustrate the editor. 
. 


You may recall that once you enter append mode using the “a” (or “append”) command, you 
need to type a line containing only a period (.) to exit append mode. 


Interrupt 


Should you press the RUB key (sometimes labelled DELETE) while working with edit, it 
will send this message to you: 


Interrupt 


Any command that edit might be executing is terminated by rub or delete, causing edit to 
prompt you for a new command. If you are appending text at the time, you will exit from 
append mode and be expected to give another command. The line of text you were typing 
when the append command was interrupted will not be entered into the buffer. 


Making corrections 


If while typing the line you hit an incorrect key, recall that you may delete the incorrect 
character or cancel the entire line of input by erasing in the usual way. Refer either to the 
last few pages of Session 1 or to “Communicating with UNIX” if you need to review the pro- 
cedures for making a correction. The most important idea to remember is that erasing a char- 
acter or cancelling a line must be done before you press the RETURN key. 
Listing what’s in the buffer (p) 


Having appended text to what you wrote in Session 1, you might want to see all the lines 
in the buffer. To print the contents of the buffer, type the command: 


:1,$p 


The “1” stands for line 1 of the buffer, the “$” is a special symbol designating the last line of 
the buffer, and “p” (or print) is the command to print from line 1 to the end of the buffer. 
The command “1,$p” gives you: 


This is some sample text.
And thiss is some more text.
Text editing is strange, but nice.
This is text added in Session 2.
It doesn’t mean much here, but
it does illustrate the editor.


Occasionally, you may accidentally type a character that can’t be printed, which can be done 
by striking a key while the CTRL key is pressed. In printing lines, edit uses a special notation 
to show the existence of non-printing characters. Suppose you had introduced the non-printing
character “control-A” into the word “illustrate” by accidentally pressing the CTRL key
while typing “a”. This can happen on many terminals because the CTRL key and the “A” key 
are beside each other. If your finger presses between the two keys, control-A results. When 
asked to print the contents of the buffer, edit would display 


it does illustr^Ate the editor.


To represent the control-A, edit shows “^A”. The sequence “^” followed by a capital letter
stands for the one character entered by holding down the CTRL key and typing the letter
which appears after the “^”. We’ll soon discuss the commands that can be used to correct this
typing error. 
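

For readers who know a little programming, the notation can be sketched in a few lines of
Python. This is only an illustration of the convention described above, not edit’s own
code. The function name show_control_chars is invented for the example, and we assume
that the tab character is left as it is.

```python
def show_control_chars(line: str) -> str:
    """Render control characters as a caret followed by a capital letter."""
    out = []
    for ch in line:
        code = ord(ch)
        if 1 <= code <= 26 and ch != "\t":  # assumption: tabs are not rewritten
            # control-A has code 1, so adding 64 gives "A" (code 65)
            out.append("^" + chr(code + 64))
        else:
            out.append(ch)
    return "".join(out)


# CTRL was held while "a" was typed, so control-A took the place of the "a".
shown = show_control_chars("it does illustr\x01te the editor.")
print(shown)
```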


In looking over the text we see that “this” is typed as “thiss” in the second line, a deli- 
berate error so we can learn to make corrections. Let’s correct the spelling. 


Finding things in the buffer 


In order to change something in the buffer we first need to find it. We can find “thiss” 
in the text we have entered by looking at a listing of the lines. Physically speaking, we search 
the lines of text looking for “thiss” and stop searching when we have found it. The way to tell 
edit to search for something is to type it inside slash marks: 


:/thiss/ 


By typing /thiss/ and pressing RETURN, you instruct edit to search for “thiss”. If you ask 
edit to look for a pattern of characters which it cannot find in the buffer, it will respond
“Pattern not found”. When edit finds the characters “thiss”, it will print the line of text for your
inspection: 


And thiss is some more text. 


Edit is now positioned in the buffer at the line it just printed, ready to make a change in the 
line. 


†The numeral “one” is the top left-most key, and should not be confused with the letter “el”.


The current line


Edit keeps track of the line in the buffer where it is located at all times during an edit- 
ing session. In general, the line that has been most recently printed, entered, or changed is 
the current location in the buffer. The editor is prepared to make changes at the current loca- 
tion in the buffer, unless you direct it to another location. 


In particular, when you bring a file into the buffer, you will be located at the last line in 
the file, where the editor left off copying the lines from the file to the buffer. If your first 
editing command is “append”, the lines you enter are added to the end of the file, after the 
current line — the last line in the file. 


You can refer to your current location in the buffer by the symbol period (.) usually 
known by the name “dot”. If you type “.” and carriage return you will be instructing edit to 
print the current line: 


And thiss is some more text. 


If you want to know the number of the current line, you can type .= and press RETURN, 
and edit will respond with the line number: 


2 


If you type the number of any line and press RETURN, edit will position you at that line and 
print its contents: 


:2 
And thiss is some more text. 


You should experiment with these commands to gain experience in using them to make 
changes. 


Numbering lines (nu) 


The number (nu) command is similar to print, giving both the number and the text of 
each printed line. To see the number and the text of the current line type 


:nu 
2 And thiss is some more text. 


Note that the shortest abbreviation for the number command is “nu” (and not “n”, which is 
used for a different command). You may specify a range of lines to be listed by the number 
command in the same way that lines are specified for print. For example, 1,$nu lists all lines 
in the buffer with their corresponding line numbers. 
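

If you think of the buffer as a list of lines, the number command amounts to pairing
each line with its position. The Python sketch below is an illustration only; the name
number_lines and the six-column width of the number field are assumptions made for the
example, not a description of edit’s internals.

```python
def number_lines(buffer, first, last):
    """Return lines first..last (1-based, inclusive), each prefixed by its number."""
    return [f"{n:6d}  {buffer[n - 1]}" for n in range(first, last + 1)]


sample = ["This is some sample text.", "And thiss is some more text."]
for row in number_lines(sample, 1, len(sample)):  # like 1,$nu
    print(row)
```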


Substitute command (s) 


Now that you have found the misspelled word, you can change it from “thiss” to “this”.
As far as edit is concerned, changing things is a matter of substituting one thing for another.
As “a” stood for append, so “s” stands for substitute. We will use the abbreviation “s” to reduce
the chance of mistyping the substitute command. This command will instruct edit to make 
the change: 


2s/thiss/this/ 


We first indicate the line to be changed, line 2, and then type an “s” to indicate we want edit 
to make a substitution. Inside the first set of slashes are the characters that we want to 
change, followed by the characters to replace them, and then a closing slash mark. To sum- 
marize: 


2s/ what is to be changed / what to change it to /


If edit finds an exact match of the characters to be changed it will make the change only in 
the first occurrence of the characters. If it does not find the characters to be changed, it will 
respond: 


Substitute pattern match failed 


indicating that your instructions could not be carried out. When edit does find the characters 
that you want to change, it will make the substitution and automatically print the changed 
line, so that you can check that the correct substitution was made. In the example, 


:2s/thiss/this/ 
And this is some more text. 


line 2 (and line 2 only) will be searched for the characters “thiss”, and when the first exact 
match is found, “thiss” will be changed to “this”. Strictly speaking, it was not necessary
above to specify the number of the line to be changed. In 


:s/thiss/this/


edit will assume that we mean to change the line where we are currently located (“.”). In this
case, the command without a line number would have produced the same result because we 
were already located at the line we wished to change. 


For another illustration of the substitute command, let us choose the line: 
Text editing is strange, but nice. 


You can make this line a bit more positive by taking out the characters “strange, but ” so the 
line reads: 


Text editing is nice. 
A command that will first position edit at the desired line and then make the substitution is: 


:/strange/s/strange, but // 


What we have done here is combine our search with our substitution. Such combinations are 
perfectly legal, and speed up editing quite a bit once you get used to them. That is, you do 
not necessarily have to use line numbers to identify a line to edit. Instead, you may identify 
the line you want to change by asking edit to search for a specified pattern of letters that 
occurs in that line. The parts of the above command are:


/strange/ tells edit to find the characters “strange” in the text 
s tells edit to make a substitution 
/strange, but // substitutes nothing at all for the characters “strange, but ” 


You should note the space after “but” in “/strange, but /”. If you do not indicate that 
the space is to be taken out, your line will read: 


Text editing is  nice.


which looks a little funny because of the extra space between “is” and “nice”. Again, we real- 
ize from this that a blank space is a real character to a computer, and in editing text we need 
to be aware of spaces within a line just as we would be aware of an “a” or a “4”. 
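

The effect of the space can be tried out away from the editor. The Python sketch below
imitates the substitute command on a single line, under the simplifying assumption that
the characters to be changed are plain text with no special characters. The function
name substitute is ours.

```python
def substitute(line, old, new):
    """Replace only the first occurrence of old, as the substitute command does."""
    if old not in line:
        raise ValueError("Substitute pattern match failed")
    return line.replace(old, new, 1)


line = "Text editing is strange, but nice."
with_space = substitute(line, "strange, but ", "")    # space after "but" included
without_space = substitute(line, "strange, but", "")  # space forgotten
print(with_space)
print(without_space)
```

The second result keeps two spaces between “is” and “nice”, exactly the slip described
above.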


Another way to list what’s in the buffer (z) 


Although the print command is useful for looking at specific lines in the buffer, other 
commands may be more convenient for viewing large sections of text. You can ask to see a 
screen full of text at a time by using the command z. If you type 


:1z


edit will start with line 1 and continue printing lines, stopping either when the screen of your 
terminal is full or when the last line in the buffer has been printed. If you want to read the 
next segment of text, type the command 


:z


If no starting line number is given for the z command, printing will start at the “current” line, 
in this case the last line printed. Viewing lines in the buffer one screen full at a time is known 
as paging. Paging can also be used to print a section of text on a hard-copy terminal. 
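

Paging can be pictured as taking one slice of the buffer at a time. The sketch below is a
rough model only: the name page and the screen size of 24 lines are assumptions, and the
real editor decides for itself how many lines fit on your terminal.

```python
def page(buffer, start, screen_lines=24):
    """Return (lines shown, number of the last line shown) for one screenful."""
    last = min(start + screen_lines - 1, len(buffer))
    return buffer[start - 1:last], last


document = [f"line {n}" for n in range(1, 51)]   # a 50-line buffer
first_screen, stopped_at = page(document, 1)     # like 1z
print(len(first_screen), stopped_at)
```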


Saving the modified text 
This seems to be a good place to pause in our work, and so we should end the second 
session. If you (in haste) type “q” to quit the session your dialogue with edit will be:


:q 
No write since last change (:quit! overrides) 


This is edit’s warning that you have not written the modified contents of the buffer to disk. 
You run the risk of losing the work you did during the editing session since you typed the 
latest write command. Because in this lesson we have not written to disk at all, everything we 
have done would have been lost if edit had obeyed the q command. If you did not want to 
save the work done during this editing session, you would have to type “q!” (or “quit!”) to
confirm that you indeed wanted to end the session immediately, leaving the file as it was after 
the most recent “write” command. However, since you want to save what you have edited, 
you need to type: 


:w
“text” 6 lines, 171 characters


and then follow with the commands to quit and logout: 


:q 

% logout 
and hang up the phone or turn off the terminal when UNIX asks for a name. Terminals con- 
nected to the port selector will stop after the logout command, and pressing keys on the key- 
board will do nothing. 


This is the end of the second session on UNIX text editing. 


Session 3


Bringing text into the buffer (e) 


Login to UNIX and make contact with edit. You should try to login without looking at 
the notes, but if you must then by all means do. 


Did you remember to give the name of the file you wanted to edit? That is, did you 
type 
% edit text 
or simply 
% edit 


Both ways get you in contact with edit, but the first way will bring a copy of the file named 
“text” into the buffer. If you did forget to tell edit the name of your file, you can get it into 
the buffer by typing: 


:e text 
“text” 6 lines, 171 characters


The command edit, which may be abbreviated e, tells edit that you want to erase anything 
that might already be in the buffer and bring a copy of the file “text” into the buffer for edit- 
ing. You may also use the edit (e) command to change files in the middle of an editing ses- 
sion, or to give edit the name of a new file that you want to create. Because the edit com- 
mand clears the buffer, you will receive a warning if you try to edit a new file without having 
saved a copy of the old file. This gives you a chance to write the contents of the buffer to disk 
before editing the next file. 


Moving text in the buffer (m) 


Edit allows you to move lines of text from one location in the buffer to another by means 
of the move (m) command. The first two examples are for illustration only, though after you 
have read this Session you are welcome to return to them for practice. The command 


:2,4m$ 


directs edit to move lines 2, 3, and 4 to the end of the buffer ($). The format for the move 
command is that you specify the first line to be moved, the last line to be moved, the move 
command “m”, and the line after which the moved text is to be placed. So, 


:1,3m6 


would instruct edit to move lines 1 through 3 (inclusive) to a location after line 6 in the buffer. 
To move only one line, say, line 4, to a location in the buffer after line 5, the command would 
be “4m5”. 


Let’s move some text using the command: 


:5,$m1 
2 lines moved 
it does illustrate the editor. 


After executing a command that moves more than one line of the buffer, edit tells how many 
lines were affected by the move and prints the last moved line for your inspection. If you 
want to see more than just the last line, you can then use the print (p), z, or number (nu) 
command to view more text. The buffer should now contain: 


This is some sample text.
It doesn’t mean much here, but
it does illustrate the editor.
And this is some more text.
Text editing is nice.
This is text added in Session 2.

You can restore the original order by typing: 
:4,$m1 
or, combining context searching and the move command: 
:/And this is some/,/This is text/m/This is some sample/ 


(Do not type both examples here!) The problem with combining context searching with the 
move command is that your chance of making a typing error in such a long command is 
greater than if you type line numbers. 


Copying lines (copy) 


The copy command is used to make a second copy of specified lines, leaving the original 
lines where they were. Copy has the same format as the move command, for example: 


:2,5copy $ 


makes a copy of lines 2 through 5, placing the added lines after the buffer’s end ($). Experi- 
ment with the copy command so that you can become familiar with how it works. Note that 
the shortest abbreviation for copy is co (and not the letter “c”, which has another meaning).
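

In the same spirit, copying can be modelled as inserting a duplicate of the chosen lines
while leaving the buffer otherwise untouched. The name copy_lines is invented for this
sketch.

```python
def copy_lines(buffer, first, last, dest):
    """Insert a copy of lines first..last after line dest; the originals stay put."""
    block = buffer[first - 1:last]
    return buffer[:dest] + block + buffer[dest:]


letters = ["a", "b", "c", "d", "e", "f"]
copied = copy_lines(letters, 2, 5, len(letters))   # like 2,5copy $
print(copied)
```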


Deleting lines (d) 
Suppose you want to delete the line 
This is text added in Session 2. 


from the buffer. If you know the number of the line to be deleted, you can type that number 
followed by delete or d. This example deletes line 4, which is “This is text added in Session 
2.” if you typed the commands suggested so far. 


:4d 
It doesn’t mean much here, but 


Here “4” is the number of the line to be deleted, and “delete” or “d” is the command to 
delete the line. After executing the delete command, edit prints the line that has become the 
current line (“.”).


If you do not happen to know the line number you can search for the line and then 
delete it using this sequence of commands: 


:/added in Session 2./
This is text added in Session 2.
:d
It doesn’t mean much here, but


The “/added in Session 2./” asks edit to locate and print the line containing the indicated 
text, starting its search at the current line and moving line by line until it finds the text. 
Once you are sure that you have correctly specified the line you want to delete, you can enter 
the delete (d) command. In this case it is not necessary to specify a line number before the 
“d”. If no line number is given, edit deletes the current line (“.”), that is, the line found by
our search. After the deletion, your buffer should contain: 


This is some sample text.
And this is some more text.
Text editing is nice.
It doesn’t mean much here, but
it does illustrate the editor.
And this is some more text.
Text editing is nice.
This is text added in Session 2.
It doesn’t mean much here, but


To delete both lines 2 and 3: 


And this is some more text. 
Text editing is nice. 


you type 
:2,3d 
2 lines deleted 
which specifies the range of lines from 2 to 3, and the operation on those lines: “d” for
delete. If you delete more than one line you will receive a message telling you the number of
lines deleted, as indicated in the example above. 


The previous example assumes that you know the line numbers for the lines to be 
deleted. If you do not you might combine the search command with the delete command: 


:/And this is some/,/Text editing is nice./d 
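

A delete by line number can be modelled as follows. The sketch also returns the number
of the line that becomes current afterwards, which in the examples above is the line
that followed the deleted ones. The function name and the handling of a deletion at the
very end of the buffer are our assumptions.

```python
def delete(buffer, first, last):
    """Delete lines first..last; return (new buffer, new current line number)."""
    remaining = buffer[:first - 1] + buffer[last:]
    # The line after the deleted range becomes current; if the range ran to
    # the end of the buffer, the new last line does instead.
    current = min(first, len(remaining))
    return remaining, current


lines = ["one", "two", "three", "four", "five"]
after, dot = delete(lines, 2, 3)   # like 2,3d
print(after, dot)
```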


A word or two of caution 


In using the search function to locate lines to be deleted you should be absolutely 
sure the characters you give as the basis for the search will take edit to the line you want 
deleted. Edit will search for the first occurrence of the characters starting from where you last 
edited — that is, from the line you see printed if you type dot (.). 


A search based on too few characters may result in the wrong lines being deleted, which 
edit will do as easily as if you had meant it. For this reason, it is usually safer to specify the 
search and then delete in two separate steps, at least until you become familiar enough with 
using the editor that you understand how best to specify searches. For a beginner it is not a 
bad idea to double-check each command before pressing RETURN to send the command on its 
way. 


Undo (u) to the rescue 


The undo (u) command has the ability to reverse the effects of the last command that 
changed the buffer. To undo the previous command, type “u” or “undo”. Undo can rescue
the contents of the buffer from many an unfortunate mistake. However, its powers are not 
unlimited, so it is still wise to be reasonably careful about the commands you give. 


It is possible to undo only commands which have the power to change the buffer — for 
example, delete, append, move, copy, substitute, and even undo itself. The commands write 
(w) and edit (e), which interact with disk files, cannot be undone, nor can commands that do 
not change the buffer, such as print. Most importantly, the only command that can be 
reversed by undo is the last “undo-able” command you typed. You can use control-H and @ 
to change commands while you are typing them, and undo to reverse the effect of the com- 
mands after you have typed them and pressed RETURN. 


To illustrate, let’s issue an undo command. Recall that the last buffer-changing command
we gave deleted the lines formerly numbered 2 and 3. Typing undo at this moment will
reverse the effects of the deletion, causing those two lines to be replaced in the buffer. 


:u
2 more lines in file after undo 
And this is some more text. 


Here again, edit informs you if the command affects more than one line, and prints the text of 
the line which is now “dot” (the current line). 
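

One way to picture an undo that reaches back only one command, and that can itself be
undone, is to keep a single saved copy of the buffer. The class below is a sketch built
on that assumption; the names Buffer, change, and undo are ours, and edit’s real
bookkeeping may well differ.

```python
class Buffer:
    """A buffer that remembers only its contents before the last change."""

    def __init__(self, lines):
        self.lines = list(lines)
        self._previous = None

    def change(self, new_lines):
        """Apply any buffer-changing command, saving the old contents first."""
        self._previous = self.lines
        self.lines = list(new_lines)

    def undo(self):
        """Swap current and saved contents, so a second undo reverses the first."""
        if self._previous is None:
            raise RuntimeError("nothing to undo")
        self.lines, self._previous = self._previous, self.lines


buf = Buffer(["one", "two", "three", "four"])
buf.change(["one", "four"])   # the effect of 2,3d
buf.undo()                    # the two deleted lines come back
print(buf.lines)
```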


More about the dot (.) and buffer end ($) 
The function assumed by the symbol dot depends on its context. It can be used: 
1. to exit from append mode; we type dot (and only a dot) on a line and press RETURN; 
2. to refer to the line we are at in the buffer. 


Dot can also be combined with the equal sign to get the number of the line currently being 
edited: 


:.=


If we type “.=” we are asking for the number of the line, and if we type “.” we are asking for
the text of the line.


In this editing session and the last, we used the dollar sign to indicate the end of the 
buffer in commands such as print, copy, and move. The dollar sign as a command asks edit to 
print the last line in the buffer. If the dollar sign is combined with the equal sign ($=) edit 
will print the line number corresponding to the last line in the buffer. 


“.” and “$”, then, represent line numbers. Whenever appropriate, these symbols can be 
used in place of line numbers in commands. For example 


:.,$d


instructs edit to delete all lines from the current line (.) to the end of the buffer. 
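

The idea that “.” and “$” simply stand for line numbers can be made concrete with a short
Python sketch. It handles only the three forms discussed so far (a numeral, dot, and
dollar); the names resolve and parse_range are invented for the illustration.

```python
def resolve(address, current, last):
    """Turn ".", "$", or a numeral into a line number."""
    if address == ".":
        return current
    if address == "$":
        return last
    return int(address)


def parse_range(text, current, last):
    """Split "first,last" into a pair of line numbers; one address means one line."""
    first, _, second = text.partition(",")
    start = resolve(first, current, last)
    end = resolve(second, current, last) if second else start
    return start, end


# With the current line at 3 in a 6-line buffer, ".,$" means lines 3 through 6.
print(parse_range(".,$", current=3, last=6))
```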


Moving around in the buffer (+ and -)


When you are editing you often want to go back and re-read a previous line. You could 
specify a context search for a line you want to read if you remember some of its text, but if 
you simply want to see what was written a few, say 3, lines ago, you can type


:-3p


This tells edit to move back to a position 3 lines before the current line (.) and print that line.
You can move forward in the buffer similarly: 


:+2p


instructs edit to print the line that is 2 ahead of your current position. 




You may use “+” and “-” in any command where edit accepts line numbers. Line
numbers specified with “+” or “-” can be combined to print a range of lines. The command


:-1,+2copy$


makes a copy of 4 lines: the current line, the line before it, and the two after it. The copied 
lines will be placed after the last line in the buffer ($), and the original lines referred to by 
“-1” and “+2” remain where they are.


Try typing only “-”; you will move back one line just as if you had typed “-1p”. Typing
the command “+” works similarly. You might also try typing a few plus or minus signs in
a row (such as “+++”) to see edit’s response. Typing RETURN alone on a line is the 
equivalent of typing “+1p”; it will move you one line ahead in the buffer and print that line. 


If you are at the last line of the buffer and try to move further ahead, perhaps by typing 
a “+” or a carriage return alone on the line, edit will remind you that you are at the end of
the buffer: 


At end-of-file
or 
Not that many lines in buffer 


Similarly, if you try to move to a position before the first line, edit will print one of these mes- 
sages: 


Nonzero address required on this command 
or 
Negative address - first buffer line is 1


The number associated with a buffer line is the line’s “address”, in that it can be used to 
locate the line. 
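

Relative addresses are ordinary arithmetic on the current line number, with a check that
the result still lies inside the buffer. In the sketch below the function name relative is
ours, and which message goes with which mistake is our guess; the text above says only
that edit prints one of them.

```python
def relative(current, offset, last):
    """Return the line number offset lines away from current, or raise an error."""
    target = current + offset
    if target < 0:
        raise IndexError("Negative address - first buffer line is 1")
    if target == 0:
        raise IndexError("Nonzero address required on this command")
    if target > last:
        raise IndexError("Not that many lines in buffer")
    return target


back_three = relative(5, -3, last=6)   # like -3 when the current line is 5
ahead_two = relative(4, +2, last=6)    # like +2 when the current line is 4
print(back_three, ahead_two)
```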


Changing lines (c) 


You can also delete certain lines and insert new text in their place. This can be accom- 
plished easily with the change (c) command. The change command instructs edit to delete 
specified lines and then switch to text input mode to accept the text that will replace them. 
Let’s say you want to change the first two lines in the buffer: 


This is some sample text. 
And this is some more text. 


to read 
This text was created with the UNIX text editor. 
To do so, you type: 


:1,2c
2 lines changed
This text was created with the UNIX text editor.
.


In the command 1,2c we specify that we want to change the range of lines beginning with 1 
and ending with 2 by giving line numbers as with the print command. These lines will be 
deleted. After you type RETURN to end the change command, edit notifies you if more than 
one line will be changed and places you in text input mode. Any text typed on the following 
lines will be inserted into the position where lines were deleted by the change command. You 
will remain in text input mode until you exit in the usual way, by typing a period 
alone on a line. Note that the number of lines added to the buffer need not be the same as 
the number of lines deleted. 
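

Seen as an operation on a list of lines, change is a delete followed by an insertion at
the same place. The sketch below uses an invented function name, change_lines, and shows
that the replacement may be shorter or longer than what it replaces.

```python
def change_lines(buffer, first, last, new_text):
    """Replace lines first..last with new_text, which may differ in length."""
    return buffer[:first - 1] + list(new_text) + buffer[last:]


before = [
    "This is some sample text.",
    "And this is some more text.",
    "Text editing is nice.",
]
changed = change_lines(before, 1, 2,
                       ["This text was created with the UNIX text editor."])
for text in changed:
    print(text)
```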


This is the end of the third session on text editing with UNIX. 


Session 4


This lesson covers several topics, starting with commands that apply throughout the 
buffer, characters with special meanings, and how to issue UNIX commands while in the editor. 
The next topics deal with files: more on reading and writing, and methods of recovering files 
lost in a crash. The final section suggests sources of further information. 


Making commands global (g) 


One disadvantage to the commands we have used for searching or substituting is that if 
you have a number of instances of a word to change it appears that you have to type the com- 
mand repeatedly, once for each time the change needs to be made. Edit, however, provides a 
way to make commands apply to the entire contents of the buffer — the global (g) com- 
mand. 


To print all lines containing a certain sequence of characters (say, “text”) the command 
is: 
:g/text/p 


The “g” instructs edit to make a global search for all lines in the buffer containing the charac- 
ters “text”. The “p” prints the lines found. 


To issue a global command, start by typing a “g” and then a search pattern identifying 
the lines to be affected. Then, on the same line, type the command to be executed for the 
identified lines. Global substitutions are frequently useful. For example, to change all 
instances of the word “text” to the word “material” the command would be a combination of 
the global search and the substitute command: 


:g/text/s/text/material/g 


Note the “g” at the end of the global command, which instructs edit to change each and every 
instance of “text” to “material”. If you do not type the “g” at the end of the command only 
the first instance of “text” in each line will be changed (the normal result of the substitute 
command). The “g” at the end of the command is independent of the “g” at the beginning.
You may give a command such as: 


:5s/text/material/g 


to change every instance of “text” in line 5 alone. Further, neither command will change 
“text” to “material” if “Text” begins with a capital rather than a lower-case t. 


Edit does not automatically print the lines modified by a global command. If you want
the lines to be printed, type a “p” at the end of the global command:
:g/text/s/text/material/gp 


You should be careful about using the global command in combination with any other — in 
essence, be sure of what you are telling edit to do to the entire buffer. For example, 


:g/ /d
72 less lines in file after global 


will delete every line containing a blank anywhere in it. This could adversely affect your 
document, since most lines have spaces between words and thus would be deleted. After exe- 
cuting the global command, edit will print a warning if the command added or deleted more 
than one line. Fortunately, the undo command can reverse the effects of a global command. 
You should experiment with the global command on a small file of text to see what it can do 
for you. 
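

A safe way to get a feel for global commands is to model them on a list of lines first.
The Python sketch below imitates a global substitution, with and without the trailing “g”,
and a global delete. It treats the search text as plain characters rather than as a
pattern, and the function names are our own.

```python
def global_substitute(buffer, old, new, every=True):
    """Imitate g/old/s/old/new/ with (every=True) or without the trailing g."""
    count = -1 if every else 1   # for str.replace, -1 means all occurrences
    return [line.replace(old, new, count) for line in buffer]


def global_delete(buffer, text):
    """Imitate g/text/d: drop every line that contains text."""
    return [line for line in buffer if text not in line]


notes = ["text and more text", "Text with a capital", "nothing"]
print(global_substitute(notes, "text", "material"))
print(global_substitute(notes, "text", "material", every=False))
print(global_delete(notes, " "))   # only lines without a blank survive
```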


More about searching and substituting


In using slashes to identify a character string that we want to search for or change, we 
have always specified the exact characters. There is a less tedious way to repeat the same 
string of characters. To change “text” to “texts” we may type either 


:/text/s/text/texts/ 
as we have done in the past, or a somewhat abbreviated command: 
:/text/s//texts/ 


In this example, the characters to be changed are not specified — there are no characters, not 
even a space, between the two slash marks that indicate what is to be changed. This lack of 
characters between the slashes is taken by the editor to mean “use the characters we last 
searched for as the characters to be changed.” 


Similarly, the last context search may be repeated by typing a pair of slashes with noth- 
ing between them: 


:/does/
It doesn’t mean much here, but
://
it does illustrate the editor.


(You should note that the search command found the characters “does” in the word “doesn’t” 
in the first search request.) Because no characters are specified for the second search, the
editor scans the buffer for the next occurrence of the characters “does”.


Edit normally searches forward through the buffer, wrapping around from the end of the 
buffer to the beginning, until the specified character string is found. If you want to search in 
the reverse direction, use question marks (?) instead of slashes to surround the characters you 
are searching for. 
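

Wrapping around the end of the buffer is easy to model with remainder arithmetic. The
sketch below assumes that the search begins with the line next to the current one and
looks at the current line last; the function name search is ours, and the text searched
for is treated as plain characters.

```python
def search(buffer, text, current, backward=False):
    """Return the number of the next line containing text, or None if none does."""
    n = len(buffer)
    step = -1 if backward else 1
    for i in range(1, n + 1):
        # Walk away from the current line, wrapping at either end of the buffer.
        line_no = (current - 1 + step * i) % n + 1
        if text in buffer[line_no - 1]:
            return line_no
    return None   # edit would answer "Pattern not found"


session = [
    "This is some sample text.",
    "And this is some more text.",
    "Text editing is nice.",
    "This is text added in Session 2.",
    "It doesn't mean much here, but",
    "it does illustrate the editor.",
]
first_hit = search(session, "does", current=1)            # like /does/
second_hit = search(session, "does", current=first_hit)   # like //
print(first_hit, second_hit)
```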


It is also possible to repeat the last substitution without having to retype the entire com- 
mand. An ampersand (&) used as a command repeats the most recent substitute command, 
using the same search and replacement patterns. After altering the current line by typing 


:s/text/texts/
you type
:/text/&
or simply
://&


to make the same change on the next line in the buffer containing the characters “text”.


Special characters 


Two characters have special meanings when used in specifying searches: “$” and “^”.
“$” is taken by the editor to mean “end of the line” and is used to identify strings that occur 
at the end of a line. 


:g/text.$/s//material./p


tells the editor to search for all lines ending in “text.” (and nothing else, not even a blank 
space), to change each final “text.” to “material.”, and print the changed lines. 


The symbol “^” indicates the beginning of a line. Thus,
:s/^/1. /


instructs the editor to insert “1.” and a space at the beginning of the current line.


The characters “$” and “^” have special meanings only in the context of searching. At
other times, they are ordinary characters. If you ever need to search for a character that has a 
special meaning, you must indicate that the character is to lose temporarily its special 
significance by typing another special character, the backslash (\), before it. 


:s/\$/dollar/ 


looks for the character “$” in the current line and replaces it by the word “dollar”. Were it 
not for the backslash, the “$” would have represented “the end of the line” in your search
rather than the character “$”. The backslash retains its special significance unless it is pre- 
ceded by another backslash. 
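

The same three conventions (“^”, “$”, and the backslash) happen to be shared by the
pattern matcher in Python’s re module, so their effect can be tried there. Keep in mind
that this is an analogy: Python’s patterns have many features that edit’s do not, and the
sample line is invented.

```python
import re

line = "The price is $5"

# "$" by itself marks the end of the line, so the new text is added there.
at_end = re.sub(r"$", " dollars", line, count=1)

# "\$" has lost its special meaning and matches the dollar sign itself.
literal = re.sub(r"\$", "dollar ", line, count=1)

# "^" marks the beginning of the line, as in s/^/1. /
at_start = re.sub(r"^", "1. ", line, count=1)

print(at_end)
print(literal)
print(at_start)
```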


Issuing UNIX commands from the editor 


After creating several files with the editor, you may want to delete files no longer useful 
to you or ask for a list of your files. Removing and listing files are not functions of the editor, 
and so they require the use of UNIX system commands (also referred to as “shell” commands, 
as “shell” is the name of the program that processes UNIX commands). You do not need to 
quit the editor to execute a UNIX command as long as you indicate that it is to be sent to the 
shell for execution. To use the UNIX command rm to remove the file named “junk” type: 

:!rm junk
! 


The exclamation mark (!) indicates that the rest of the line is to be processed as a shell com- 
mand. If the buffer contents have not been written since the last change, a warning will be 
printed before the command is executed: 


[No write since last change] 




The editor prints a “!” when the command is completed. The tutorial “Communicating with
UNIX” describes useful features of the system, of which the editor is only one part. 


Filenames and file manipulation 


Throughout each editing session, edit keeps track of the name of the file being edited as 
the current filename. Edit remembers as the current filename the name given when you 
entered the editor. The current filename changes whenever the edit (e) command is used to 
specify a new file. Once edit has recorded a current filename, it inserts that name into any 
command where a filename has been omitted. If a write command does not specify a file, edit, 
as we have seen, supplies the current filename. If you are editing a file named “draft3” having 
283 lines in it, you can have the editor write onto a different file by including its name in the 
write command: 


:w chapter3 
“chapter3” [new file] 283 lines, 8698 characters


The current filename remembered by the editor will not be changed as a result of the write 
command. Thus, if the next write command does not specify a name, edit will write onto the 
current file (“draft3”) and not onto the file “chapter3”.


The file (f) command 


To ask for the current filename, type file (or f). In response, the editor provides current 
information about the buffer, including the filename, your current position, the number of 
lines in the buffer, and the percent of the distance through the file your current location is. 


:f
“text” [Modified] line 3 of 4 --75%--


If the contents of the buffer have changed since the last time the file was written, the editor 
will tell you that the file has been “[Modified]”. After you save the changes by writing onto a
disk file, the buffer will no longer be considered modified: 


:w
“text” 4 lines, 88 characters
:f
“text” line 3 of 4 --75%--
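

The percentage in the status line is the current line number taken as a fraction of the
number of lines in the buffer: line 3 of 4 is 75%. The sketch below rebuilds such a
status line; the function name file_status and the rounding down to a whole percent are
assumptions on our part.

```python
def file_status(name, current, total, modified):
    """Build a status line like the one printed by the file command."""
    percent = 100 * current // total if total else 0
    flag = " [Modified]" if modified else ""
    return f'"{name}"{flag} line {current} of {total} --{percent}%--'


print(file_status("text", 3, 4, modified=True))
print(file_status("text", 3, 4, modified=False))
```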


Reading additional files (r) 


The read (r) command allows you to add the contents of a file to the buffer at a 
specified location, essentially copying new lines between two existing lines. To use it, specify 
the line after which the new text will be placed, the read (r) command, and then the name of 
the file. If you have a file named “example”, the command 


:$r example 
“example” 18 lines, 473 characters


reads the file “example” and adds it to the buffer after the last line. The current filename is 
not changed by the read command. 


Writing parts of the buffer 


The write (w) command can write all or part of the buffer to a file you specify. We are 
already familiar with writing the entire contents of the buffer to a disk file. To write only 
part of the buffer onto a file, indicate the beginning and ending lines before the write com- 
mand, for example 


:45,$w ending 


Here all lines from 45 through the end of the buffer are written onto the file named ending. 
The lines remain in the buffer as part of the document you are editing, and you may continue 
to edit the entire buffer. Your original file is unaffected by your command to write part of the 
buffer to another file. Edit still remembers whether you have saved changes to the buffer in 
your original file or not. 


Recovering files 


Although it does not happen very often, there are times UNIX stops working because of 
some malfunction. This situation is known as a crash. Under most circumstances, edit’s 
crash recovery feature is able to save work to within a few lines of changes before a crash (or 
an accidental phone hang up). If you lose the contents of an editing buffer in a system crash, 
you will normally receive mail when you login that gives the name of the recovered file. To 
recover the file, enter the editor and type the command recover (rec), followed by the name 
of the lost file. For example, to recover the buffer for an edit session involving the file 
“chap6”, the command is: 


:recover chap6 


Recover is sometimes unable to save the entire buffer successfully, so always check the con- 
tents of the saved buffer carefully before writing it back onto the original file. For best 
results, write the buffer to a new file temporarily so you can examine it without risk to the ori- 
ginal file. Unfortunately, you cannot use the recover command to retrieve a file you removed 
using the shell command rm. 


Other recovery techniques 


If something goes wrong when you are using the editor, it may be possible to save your 
work by using the command preserve (pre), which saves the buffer as if the system had 
crashed. If you are writing a file and you get the message “Quota exceeded”, you have tried to 
use more disk storage than is allotted to your account. Proceed with caution because it is
likely that only a part of the editor’s buffer is now present in the file you tried to write. In
this case you should use the shell escape from the editor (!) to remove some files you don’t 
need and try to write the file again. If this is not possible and you cannot find someone to 
help you, enter the command 


:preserve
and wait for the reply, 
File preserved. 


If you do not receive this reply, seek help immediately. Do not simply leave the editor. If you 
do, the buffer will be lost, and you may not be able to save your file. If the reply is “File 
preserved.” you can leave the editor (or logout) to remedy the situation. After a preserve, you 
can use the recover command once the problem has been corrected, or the -r option of the
edit command if you leave the editor and want to return. 


If you make an undesirable change to the buffer and type a write command before dis- 
covering your mistake, the modified version will replace any previous version of the file. 
Should you ever lose a good version of a document in this way, do not panic and leave the edi- 
tor. As long as you stay in the editor, the contents of the buffer remain accessible. Depend- 
ing on the nature of the problem, it may be possible to restore the buffer to a more complete 
state with the undo command. After fixing the damaged buffer, you can again write the file to 
disk. 


Further reading and other information 


Edit is an editor designed for beginning and casual users. It is actually a version of a 
more powerful editor called ex. These lessons are intended to introduce you to the editor and 
its more commonly-used commands. We have not covered all of the editor’s commands, but a 
selection of commands that should be sufficient to accomplish most of your editing tasks. You 
can find out more about the editor in the Ex Reference Manual, which is applicable to both 
ex and edit. The manual is available from the Computing Services Library, 218 Evans Hall. 
One way to become familiar with the manual is to begin by reading the description of com- 
mands that you already know. 


Using ex 


As you become more experienced with using the editor, you may still find that edit con- 
tinues to meet your needs. However, should you become interested in using ex, it is easy to 
switch. To begin an editing session with ex, use the name ex in your command instead of 
edit. 


Edit commands work the same way in ex, but the editing environment is somewhat 
different. You should be aware of a few differences that exist between the two versions of the 
editor. In edit, only the characters “^”, “$”, and “\” have special meanings in searching the
buffer or indicating characters to be changed by a substitute command. Several additional
characters have special meanings in ex, as described in the Ex Reference Manual. Another
feature of the edit environment prevents users from accidentally entering two alternative modes
of editing, open and visual, in which the editor behaves quite differently from normal com-
mand mode. If you are using ex and the editor behaves strangely, you may have accidentally
entered open mode by typing “o”. Type the ESC key and then a “Q” to get out of open or
visual mode and back into the regular editor command mode. The document An Introduction
to Display Editing with Vi provides a full discussion of visual mode.


A Tutorial Introduction to the UNIX Text Editor


Brian W. Kernighan 


Bell Laboratories 
Murray Hill, New Jersey 07974 


Introduction 


Ed is a “text editor”, that is, an interactive 
program for creating and modifying “text”, using 
directions provided by a user at a terminal. The 
text is often a document like this one, or a pro- 
gram or perhaps data for a program. 


This introduction is meant to simplify learn- 
ing ed. The recommended way to learn ed is to 
read this document, simultaneously using ed to 
follow the examples, then to read the description 
in section I of the UNIX Programmer’s Manual, 
all the while experimenting with ed. (Solicita- 
tion of advice from experienced users is also use- 
ful.) 


Do the exercises! They cover material not 
completely discussed in the actual text. An 
appendix summarizes the commands. 


Disclaimer 


This is an introduction and a tutorial. For 
this reason, no attempt is made to cover more 
than a part of the facilities that ed offers 
(although this fraction includes the most useful 
and frequently used parts). When you have 
mastered the Tutorial, try Advanced Editing on 
UNIX. Also, there is not enough space to 
explain basic UNIX procedures. We will assume 
that you know how to log on to UNIX, and that 
you have at least a vague understanding of what 
a file is. For more on that, read UNIX for 
Beginners. 


You must also know what character to type 
as the end-of-line on your particular terminal. 
This character is the RETURN key on most ter- 
minals. Throughout, we will refer to this charac- 
ter, whatever it is, as RETURN. 


Getting Started 


We'll assume that you have logged in to your 
system and it has just printed the prompt char- 
acter, usually either a $ or a %. The easiest way 
to get ed is to type 


ed (followed by a return) 


You are now ready to go — ed is waiting for you 
to tell it what to do. 


Creating Text — the Append command “a” 


As your first problem, suppose you want to 
create some text starting from scratch. Perhaps 
you are typing the very first draft of a paper; 
clearly it will have to start somewhere, and 
undergo modifications later. This section will 
show how to get some text in, just to get started. 
Later we’ll talk about how to change it. 


When ed is first started, it is rather like 
working with a blank piece of paper — there is 
no text or information present. This must be 
supplied by the person using ed; it is usually 
done by typing in the text, or by reading it into 
ed from a file. We will start by typing in some 
text, and return shortly to how to read files. 


First a bit of terminology. In ed jargon, the 
text being worked on is said to be “kept in a 
buffer.” Think of the buffer as a work space, if 
you like, or simply as the information that you 
are going to be editing. In effect the buffer is 
like the piece of paper, on which we will write 
things, then change some of them, and finally 
file the whole thing away for another day. 


The user tells ed what to do to his text by 
typing instructions called “commands.” Most
commands consist of a single letter, which must 
be typed in lower case. Each command is typed 
on a separate line. (Sometimes the command is 
preceded by information about what line or lines 
of text are to be affected — we will discuss these 
shortly.) Ed makes no response to most com- 
mands — there is no prompting or typing of 
messages like “ready”. (This silence is preferred 
by experienced users, but sometimes a hangup 
for beginners.) 


The first command is append, written as the 
letter


a


all by itself. It means “append (or add) text 
lines to the buffer, as I type them in.” Append- 
ing is rather like writing fresh material on a 
piece of paper. 


So to enter lines of text into the buffer, just 
type an a followed by a RETURN, followed by 
the lines of text you want, like this: 


a 
Now is the time 
for all good men 
to come to the aid of their party.
.


The only way to stop appending is to type a 
line that contains only a period. The “.” is used 
to tell ed that you have finished appending. 
(Even experienced users forget that terminating
“.” sometimes. If ed seems to be ignoring you,
type an extra line with just “.” on it. You may 
then find you’ve added some garbage lines to 
your text, which you'll have to take out later.) 


After the append command has been done, 
the buffer will contain the three lines 


Now is the time 
for all good men 
to come to the aid of their party. 


The “a” and “.” aren’t there, because they are 
not text. 


To add more text to what you already have, 
just issue another a command, and continue typ- 
ing. 


Error Messages — “?” 


If at any time you make an error in the com- 
mands you type to ed, it will tell you by typing 


?


This is about as cryptic as it can be, but with 
practice, you can usually figure out how you 
goofed. 


Writing text out as a file — the Write com- 
mand “w” 


It’s likely that you’ll want to save your text 
for later use. To write out the contents of the 
buffer onto a file, use the write command 


w


followed by the filename you want to write on. 
This will copy the buffer’s contents onto the 
specified file (destroying any previous informa- 
tion on the file). To save the text on a file 
named junk, for example, type 


w junk 


Leave a space between w and the file name. Ed 
will respond by printing the number of charac- 
ters it wrote out. In this case, ed would respond 
with 


68 


(Remember that blanks and the return character 
at the end of each line are included in the char- 
acter count.) Writing a file just makes a copy of 
the text — the buffer’s contents are not dis- 
turbed, so you can go on adding lines to it. This 
is an important point. Ed at all times works on 
a copy of a file, not the file itself. No change in 
the contents of a file takes place until you give a 
w command. (Writing out the text onto a file 
from time to time as it is being created is a good 
idea, since if the system crashes or if you make 
some horrible mistake, you will lose all the text 
in the buffer but any text that was written onto 
a file is relatively safe.) 
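
To see where the figure 68 comes from, you can count the characters yourself. The short Python fragment below is only an illustration and is not part of ed; it counts every character on each line, blanks included, plus one end-of-line character per line.

```python
text = [
    "Now is the time",
    "for all good men",
    "to come to the aid of their party.",
]

# Each line contributes its own characters plus one newline.
count = sum(len(line) + 1 for line in text)
print(count)  # 68
```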


Leaving ed — the Quit command “q” 


To terminate a session with ed, save the text 
you’re working on by writing it onto a file using 
the w command, and then type the command 


q 


which stands for quit. The system will respond 
with the prompt character ($ or %). At this 
point your buffer vanishes, with all its text, 
which is why you want to write it out before 
quitting.†


Exercise 1: 
Enter ed and create some text using 


a 
... text ...
.


Write it out using w. Then leave ed with the q 
command, and print the file, to see that every- 
thing worked. (To print a file, say 


pr filename 
or 
cat filename 


in response to the prompt character. Try both.) 


† Actually, ed will print ? if you try to quit without writ-
ing. At that point, write if you want; if not, another q
will get you out regardless.


Reading text from a file — the Edit com-
mand “e”


A common way to get text into the buffer is 
to read it from a file in the file system. This is 
what you do to edit text that you saved with the 
w command in a previous session. The edit 
command e fetches the entire contents of a file 
into the buffer. So if you had saved the three 
lines “Now is the time”, etc., with a w command 
in an earlier session, the ed command 


e junk 


would fetch the entire contents of the file junk 
into the buffer, and respond 


68 


which is the number of characters in junk. If
anything was already in the buffer, it is deleted 
first. 


If you use the e command to read a file into 
the buffer, then you need not use a file name 
after a subsequent w command; ed remembers 
the last file name used in an e command, and w 
will write on this file. Thus a good way to 
operate is 


ed 

e file 

[editing session]
w 

q 


This way, you can simply say w from time to 
time, and be secure in the knowledge that if you 
got the file name right at the beginning, you are 
writing into the proper file each time. 


You can find out at any time what file name 
ed is remembering by typing the file command f. 
In this example, if you typed 


f 
ed would reply 
junk 


Reading text from a file — the Read com- 
mand “r” 


Sometimes you want to read a file into the 
buffer without destroying anything that is 
already there. This is done by the read com- 
mand r. The command 


r junk 


will read the file junk into the buffer; it adds it 
to the end of whatever is already in the buffer. 
So if you do a read after an edit: 


e junk 
r junk 


the buffer will contain two copies of the text (six 
lines). 


Now is the time 
for all good men 
to come to the aid of their party. 
Now is the time 
for all good men 
to come to the aid of their party. 


Like the w and e commands, r prints the 
number of characters read in, after the reading 
operation is complete. 
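
The difference between e and r can be sketched in a few lines of Python. This is a toy model, not ed itself: the dictionary named files stands in for the file system, and the functions edit and read are invented names.

```python
# A stand-in for the file system: one file named junk.
files = {
    "junk": "Now is the time\nfor all good men\nto come to the aid of their party.\n"
}

def read(buffer, name):
    """Like r with no line number: add the file's lines at the end of the buffer."""
    data = files[name]
    buffer.extend(data.splitlines())
    return len(data)              # the character count that is printed

def edit(buffer, name):
    """Like e: throw away the buffer's contents, then read the file."""
    buffer.clear()
    return read(buffer, name)

buffer = []
print(edit(buffer, "junk"))  # 68
print(read(buffer, "junk"))  # 68
print(len(buffer))           # 6
```

A second edit at this point would leave only three lines in the buffer, since e empties the buffer before reading.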


Generally speaking, r is much less used than 
e. 


Exercise 2: 


Experiment with the e command — try read- 
ing and printing various files. You may get an 
error ?name, where name is the name of a file; 
this means that the file doesn’t exist, typically 
because you spelled the file name wrong, or 
perhaps that you are not allowed to read or 
write it. Try alternately reading and appending 
to see that they work similarly. Verify that 


ed filename 


is exactly equivalent to 


ed 
e filename 


What does 
f filename 


do? 


Printing the contents of the buffer — the 
Print command “p” 


To print or list the contents of the buffer (or 
parts of it) on the terminal, use the print com- 
mand 


p 


The way this is done is as follows. Specify the 
lines where you want printing to begin and 
where you want it to end, separated by a 
comma, and followed by the letter p. Thus to 
print the first two lines of the buffer, for exam- 
ple, (that is, lines 1 through 2) say 


1,2p (starting line=1, ending line=2 p) 
Ed will respond with 


Now is the time 
for all good men


Suppose you want to print all the lines in
the buffer. You could use 1,3p as above if you 
knew there were exactly 3 lines in the buffer. 
But in general, you don’t know how many there 
are, so what do you use for the ending line 
number? Ed provides a shorthand symbol for 
“line number of last line in buffer” — the dollar 
sign $. Use it this way: 

1,$p 


This will print all the lines in the buffer (line 1 
to last line.) If you want to stop the printing 
before it is finished, push the DEL or Delete key; 
ed will type 


?


and wait for the next command. 


To print the last line of the buffer, you could 
use 


$,$p 
but ed lets you abbreviate this to 


$p 
You can print any single line by typing the line 
number followed by a p. Thus 

1p
produces the response 

Now is the time 


which is the first line of the buffer. 


In fact, ed lets you abbreviate even further: 
you can print any single line by typing just the 
line number — no need to type the letter p. So 
if you say 


$ 
ed will print the last line of the buffer. 


You can also use $ in combinations like
$-1,$p


which prints the last two lines of the buffer. 
This helps when you want to see how far you got 
in typing. 


Exercise 3: 


As before, create some text using the a com- 
mand and experiment with the p command. 
You will find, for example, that you can’t print 
line 0 or a line beyond the end of the buffer, and 
that attempts to print a buffer in reverse order 
by saying 


3,1p 


don’t work. 


The current line — “Dot” or “.” 


Suppose your buffer still contains the six 
lines as above, that you have just typed 


1,3p 


and ed has printed the three lines for you. Try 
typing just 


p (no line numbers) 
This will print
to come to the aid of their party. 


which is the third line of the buffer. In fact it is 
the last (most recent) line that you have done 
anything with. (You just printed it!) You can 
repeat this p command without line numbers, 
and it will continue to print line 3. 


The reason is that ed maintains a record of 
the last line that you did anything to (in this 
case, line 3, which you just printed) so that it 
can be used instead of an explicit line number. 
This most recent line is referred to by the short-
hand symbol


.


(pronounced “dot”).


Dot is a line number in the same way that $ is; 
it means exactly “the current line”, or loosely, 
“the line you most recently did something to.” 
You can use it in several ways — one possibility 
is to say 


.,$p


This will print all the lines from (including) the 
current line to the end of the buffer. In our 
example these are lines 3 through 6. 


Some commands change the value of dot, 
while others do not. The p command sets dot to 
the number of the last line printed; the last com-
mand will set both . and $ to 6.


Dot is most useful when used in combina- 
tions like this one: 


.+1 (or equivalently, .+1p)


This means “print the next line” and is a handy 
way to step slowly through a buffer. You can 
also say 


.-1 (or .-1p)


which means “print the line before the current 
line.” This enables you to go backwards if you 
wish. Another useful one is something like


.-3,.-1p
which prints the previous three lines. 


Don’t forget that all of these change the 
value of dot. You can find out what dot is at
any time by typing


.=


Ed will respond by printing the value of dot.


Let’s summarize some things about the p 
command and dot. Essentially p can be pre- 
ceded by 0, 1, or 2 line numbers. If there is no 
line number given, it prints the “current line”, 
the line that dot refers to. If there is one line 
number given (with or without the letter p), it 
prints that line (and dot is set there); and if 
there are two line numbers, it prints all the lines 
in that range (and sets dot to the last line 
printed.) If two line numbers are specified the
first can’t be bigger than the second (see Exer-
cise 3.)
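
The rules just summarized can be written down as a short Python sketch. It is an illustration only, not ed’s code; the function name print_lines and its arguments are invented, and line numbers count from 1 as they do in ed.

```python
def print_lines(lines, dot, first=None, last=None):
    """Return (printed lines, new dot) for a p command with 0, 1, or 2 line numbers."""
    if first is None:            # no line number: use the current line
        first = dot
    if last is None:             # one line number: print just that line
        last = first
    if not (1 <= first <= last <= len(lines)):
        raise ValueError("?")    # ed would answer with a bare ?
    return lines[first - 1:last], last


buffer = ["Now is the time", "for all good men",
          "to come to the aid of their party."] * 2
print(print_lines(buffer, 3))          # the third line; dot stays 3
print(print_lines(buffer, 3, 3, 6)[1]) # 6
```

Asking for line 0, for a line past the end of the buffer, or for a range such as 3,1 makes the sketch raise an error, which corresponds to ed’s ? response.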


Typing a single return will cause printing of 
the next line — it’s equivalent to .+1p. Try it.
Try typing a -; you will find that it’s equivalent
to .-1p.


Deleting lines: the “d” command 


Suppose you want to get rid of the three 
extra lines in the buffer. This is done by the 
delete command 


d 


Except that d deletes lines instead of printing 
them, its action is similar to that of p. The lines 
to be deleted are specified for d exactly as they 
are for p: 


starting line, ending line d 
Thus the command 
4,$d 


deletes lines 4 through the end. There are now 
three lines left, as you can check by using 


1,$p 


And notice that $ now is line 3! Dot is set to 
the next line after the last line deleted, unless 
the last line deleted is the last line in the buffer. 
In that case, dot is set to $. 
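
Where dot ends up after a delete can be sketched as follows. This is a Python illustration of the rule just stated, not ed’s own code, and the function name delete is invented.

```python
def delete(lines, first, last):
    """Return (remaining lines, new dot) after deleting lines first..last (1-based)."""
    if not (1 <= first <= last <= len(lines)):
        raise ValueError("?")
    remaining = lines[:first - 1] + lines[last:]
    # The line that followed the deleted range now sits at position `first`.
    # If the range ran to the end of the buffer, dot falls back to the new $.
    dot = min(first, len(remaining))
    return remaining, dot


six = ["line %d" % n for n in range(1, 7)]
print(delete(six, 4, 6))   # three lines left, dot is 3
print(delete(six, 2, 3))   # four lines left, dot is 2 (the old line 4)
```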


Exercise 4: 


Experiment with a, e, r, w, p and d until 
you are sure that you know what they do, and 
until you understand how dot, $, and line 
numbers are used. 


If you are adventurous, try using line 
numbers with a, r and w as well. You will find 
that a will append lines after the line number 
that you specify (rather than after dot); that r 
reads a file in after the line number you specify 
(not necessarily at the end of the buffer); and 
that w will write out exactly the lines you 
specify, not necessarily the whole buffer. These 
variations are sometimes handy. For instance
you can insert a file at the beginning of a buffer
by saying 


0r filename


and you can enter lines at the beginning of the 
buffer by saying 


0a
... text ...
.


Notice that .w is very different from


.
w


Modifying text: the Substitute command “s”


We are now ready to try one of the most
important of all commands — the substitute
command


s


This is the command that is used to change indi- 
vidual words or letters within a line or group of 
lines. It is what you use, for example, for 
correcting spelling mistakes and typing errors. 


Suppose that by a typing error, line 1 says 
Now is th time 


— the e has been left off the. You can use s to 
fix this up as follows: 


1s/th/the/ 


This says: “in line 1, substitute for the charac- 
ters th the characters the.” To verify that it 
works (ed will not print the result automatically) 
say 


p 
and get 
Now is the time 


which is what you wanted. Notice that dot must 
have been set to the line where the substitution 
took place, since the p command printed that 
line. Dot is always set this way with the s com- 
mand. 


The general way to use the substitute com- 
mand is 


starting-line, ending-line s/change this/to this/ 


Whatever string of characters is between the 
first pair of slashes is replaced by whatever is 
between the second pair, in all the lines between 
starting-line and ending-line. Only the first 
occurrence on each line is changed, however. If
you want to change every occurrence, see Exer-
cise 5. The rules for line numbers are the same
as those for p, except that dot is set to the last
line changed. (But there is a trap for the 
unwary: if no substitution took place, dot is not 
changed. This causes an error ? as a warning.) 


Thus you can say 
1,$s/speling/spelling/ 


and correct the first spelling mistake on each 
line in the text. (This is useful for people who 
are consistent misspellers!) 


If no line numbers are given, the s command 
assumes we mean “make the substitution on line 
dot”, so it changes things only on the current 
line. This leads to the very common sequence 


s/something/something else/p 


which makes some correction on the current line, 
and then prints it, to make sure it worked out 
right. If it didn’t, you can try again. (Notice 
that there is a p on the same line as the s com- 
mand. With few exceptions, p can follow any 
command; no other multi-command lines are 
legal.) 


It’s also legal to say 
s/...// 


which means “change the first string of charac-
ters to nothing”, i.e., remove them. This is use-
ful for deleting extra words in a line or removing
extra letters from words. For instance, if you
had 


Nowxx is the time 
you can say 

s/xx//p 
to get 

Now is the time 


Notice that // (two adjacent slashes) means “no 
characters”, not a blank. There is a difference! 
(See below for another meaning of //.) 


Exercise 5: 


Experiment with the substitute command. 
See what happens if you substitute for some 
word on a line with several occurrences of that 
word. For example, do this: 


a 
the other side of the coin
.
s/the/on the/p
You will get 
on the other side of the coin 


A substitute command changes only the first 
occurrence of the first string. You can change
all occurrences by adding a g (for “global”) to
the s command, like this: 


s/.../.../gp 


Try other characters instead of slashes to delimit 
the two sets of characters in the s command — 
anything should work except blanks or tabs. 
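
The effect of the g can be previewed outside the editor. The Python fragment below is only an analogy: it uses plain string replacement rather than ed’s pattern matching, which gives the same result here because “the” contains no special characters.

```python
line = "the other side of the coin"

first_only = line.replace("the", "on the", 1)   # like s/the/on the/
every_one = line.replace("the", "on the")       # like s/the/on the/g

print(first_only)  # on the other side of the coin
print(every_one)   # on the oon ther side of on the coin
```

Notice what happened to “other”: the substitution works on strings of characters, not on words, so the “the” inside “other” was changed too.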


(If you get funny results using any of the 
characters 


^ . $ [ * \ &


read the section on “Special Characters”.)


Context searching — “/.../” 


With the substitute command mastered, you 
can move on to another highly important idea of 
ed — context searching. 


Suppose you have the original three line text 
in the buffer: 


Now is the time 
for all good men 
to come to the aid of their party. 


Suppose you want to find the line that contains 
their so you can change it to the. Now with 
only three lines in the buffer, it’s pretty easy to 
keep track of what line the word their is on. 
But if the buffer contained several hundred 
lines, and you’d been making changes, deleting 
and rearranging lines, and so on, you would no 
longer really know what this line number would 
be. Context searching is simply a method of 
specifying the desired line, regardless of what its 
number is, by specifying some context on it. 


The way to say “search for a line that con- 
tains this particular string of characters” is to 
type 


/string of characters we want to find/ 
For example, the ed command 
/their/ 


is a context search which is sufficient to find the 
desired line — it will locate the next occurrence 
of the characters between slashes (“their”). It 
also sets dot to that line and prints the line for 
verification: 


to come to the aid of their party. 


“Next occurrence” means that ed starts looking 
for the string at line .+1, searches to the end of 
the buffer, then continues at line 1 and searches 
to line dot. (That is, the search “wraps around” 
from $ to 1.) It scans all the lines in the buffer 
until it either finds the desired line or gets back 
to dot again. If the given string of characters 
can’t be found in any line, ed types the error 
message


?


Otherwise it prints the line it found. 
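
The order in which lines are examined can be made concrete with a small Python sketch. It is not ed’s implementation; the function name search_forward is invented, and it looks for a plain string rather than a full ed pattern.

```python
def search_forward(lines, dot, text):
    """Return the 1-based number of the next line containing text, or None."""
    n = len(lines)
    # Visit .+1 through $, then 1 through dot, so the current line is tried last.
    for step in range(1, n + 1):
        number = (dot - 1 + step) % n + 1
        if text in lines[number - 1]:
            return number
    return None


buffer = ["Now is the time", "for all good men",
          "to come to the aid of their party."]
print(search_forward(buffer, 1, "their"))  # 3
print(search_forward(buffer, 3, "Now"))    # 1
print(search_forward(buffer, 1, "xyzzy"))  # None
```

The second call shows the wrap-around: with dot at line 3, the search goes back to line 1 and finds the string there.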


You can do both the search for the desired 
line and a substitution all at once, like this: 


/their/s/their/the/p 
which will yield 
to come to the aid of the party. 


There were three parts to that last command: 
context search for the desired line, make the 
substitution, print the line. 


The expression /their/ is a context search 
expression. In their simplest form, all context 
search expressions are like this — a string of 
characters surrounded by slashes. Context 
searches are interchangeable with line numbers, 
so they can be used by themselves to find and 
print a desired line, or as line numbers for some 
other command, like s. They were used both 
ways in the examples above. 


Suppose the buffer contains the three fami- 
liar lines 


Now is the time 
for all good men 
to come to the aid of their party. 


Then the ed line numbers 


/Now/+1 
/good/ 
/party/-1


are all context search expressions, and they all 
refer to the same line (line 2). To make a 
change in line 2, you could say 


/Now/+1s/good/bad/ 
or 

/good/s/good/bad/ 
or 

/party/-1s/good/bad/


The choice is dictated only by convenience. You 
could print all three lines by, for instance 


/Now/,/party/p 
or 
/Now/,/Now/+2p 


or by any number of similar combinations. The 
first one of these might be better if you don’t 
know how many lines are involved. (Of course, 
if there were only three lines in the buffer, you’d 
use 


1,$p 


but not if there were several hundred.) 


The basic rule is: a context search expression 
is the same as a line number, so it can be used 
wherever a line number is needed. 
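
As an illustration of this rule, the Python sketch below turns a context search with an optional offset into an ordinary line number. The function name resolve is invented, plain strings stand in for ed patterns, and the error handling is an assumption rather than a description of ed’s internals.

```python
def resolve(lines, dot, text, offset=0):
    """Resolve an address such as /text/+1 or /text/-1 to a 1-based line number."""
    n = len(lines)
    for step in range(1, n + 1):               # .+1 through $, then 1 through dot
        number = (dot - 1 + step) % n + 1
        if text in lines[number - 1]:
            target = number + offset
            if not 1 <= target <= n:
                raise ValueError("?")          # the offset ran off the buffer
            return target
    raise ValueError("?")                      # no line contains text


buffer = ["Now is the time", "for all good men",
          "to come to the aid of their party."]
print(resolve(buffer, 1, "Now", +1))    # 2
print(resolve(buffer, 1, "good"))       # 2
print(resolve(buffer, 1, "party", -1))  # 2
```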


Exercise 6: 


Experiment with context searching. Try a 
body of text with several occurrences of the 
same string of characters, and scan through it 
using the same context search. 


Try using context searches as line numbers 
for the substitute, print and delete commands. 
(They can also be used with r, w, and a.) 


Try context searching using ?text? instead 
of /text/. This scans lines in the buffer in 
reverse order rather than normal. This is some- 
times useful if you go too far while looking for 
some string of characters — it’s an easy way to 
back up. 


(If you get funny results with any of the 
characters 
^ . $ [ * \ &
read the section on “Special Characters”.)


Ed provides a shorthand for repeating a con- 
text search for the same string. For example, 
the ed line number 


/string/ 


will find the next occurrence of string. It often 
happens that this is not the desired line, so the 
search must be repeated. This can be done by 
typing merely 


// 


This shorthand stands for “the most recently 
used context search expression.” It can also be 
used as the first string of the substitute com- 
mand, as in 


/string1/s//string2/ 


which will find the next occurrence of string1
and replace it by string2. This can save a lot 
of typing. Similarly 


??


means “scan backwards for the same expres- 
sion.” 


Change and Insert — “c” and “i”


This section discusses the change command


c


which is used to change or replace a group of
one or more lines, and the insert command


i


which is used for inserting a group of one or
more lines.


“Change”, written as 
c 


is used to replace a number of lines with 
different lines, which are typed in at the termi- 
nal. For example, to change lines .+1 through $ 
to something else, type 


.+1,$c
... type the lines of text you want here ...
.


The lines you type between the c command and
the . will take the place of the original lines 
between start line and end line. This is most 
useful in replacing a line or several lines which 
have errors in them. 


If only one line is specified in the c com-
mand, then just that line is replaced. (You can
type in as many replacement lines as you like.) 
Notice the use of . to end the input — this 
works just like the . in the append command 
and must appear by itself on a new line. If no 
line number is given, line dot is replaced. The 
value of dot is set to the last line you typed in. 


“Insert” is similar to append — for instance 


/string/i
... type the lines to be inserted here ...
.


will insert the given text before the next line 
that contains “string”. The text between i and .
is inserted before the specified line. If no line 
number is specified dot is used. Dot is set to the 
last line inserted. 


Exercise 7: 


“Change” is rather like a combination of 
delete followed by insert. Experiment to verify 
that 


start, end d
i
... text ...
.


is almost the same as 


start, end c
... text ...
.


These are not precisely the same if line $ gets 
deleted. Check this out. What is dot? 


Experiment with a and i, to see that they are 
similar, but not the same. You will observe that 


line-number a
... text ...
.


appends after the given line, while 


line-number i
... text ...
.


inserts before it. Observe that if no line number 
is given, i inserts before line dot, while a 
appends after line dot. 


Moving text around: the “m” command 


The move command m is used for cutting
and pasting: it lets you move a group of lines
from one place to another in the buffer. Suppose
you want to put the first three lines of the
buffer at the end instead. You could do it by
saying:

1,3w temp
$r temp
1,3d

(Do you see why?) but you can do it a lot more easily
with the m command:

1,3m$

The general case is

start line, end line m after this line

Notice that there is a third line to be specified:
the place where the moved stuff gets put. Of
course the lines to be moved can be specified by
context searches; if you had

First paragraph
end of first paragraph.
Second paragraph
end of second paragraph.

you could reverse the two paragraphs like this:

/Second/,/end of second/m/First/-1

Notice the -1: the moved text goes after the
line mentioned. Dot gets set to the last line
moved.
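
To make the bookkeeping of m concrete, here is a minimal sketch in Python that treats the buffer as a list of lines. The function name move_lines and its 1-based arguments are assumptions made for illustration; this is not how ed itself is implemented, and the refusal to move lines into the middle of the moved range is an assumption about ed's behavior.

```python
def move_lines(buf, start, end, dest):
    """Return a new buffer with lines start..end (1-based, inclusive)
    placed after line dest; dest may be 0 to mean 'before line 1'."""
    if not (1 <= start <= end <= len(buf)):
        raise ValueError("bad line range")
    if not (0 <= dest <= len(buf)):
        raise ValueError("bad destination")
    if start <= dest < end:
        # Assumption: a destination inside the moved range is refused.
        raise ValueError("destination inside moved range")
    block = buf[start - 1:end]
    rest = buf[:start - 1] + buf[end:]
    # Lines after the removed block have shifted up by len(block).
    insert_at = dest if dest < start else dest - len(block)
    return rest[:insert_at] + block + rest[insert_at:]
```

With a five-line buffer, move_lines(buf, 1, 3, 5) plays the role of 1,3m$, and moving the second paragraph after line 0 corresponds to the /First/-1 destination in the example above.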


The global commands “g” and “v” 


The global command g is used to execute
one or more ed commands on all those lines in
the buffer that match some specified string. For
example

g/peling/p

prints all lines that contain peling. More usefully,

g/peling/s//pelling/gp

makes the substitution everywhere on the line,
then prints each corrected line. Compare this to

1,$s/peling/pelling/gp

which only prints the last line substituted.
Another subtle difference is that the g command
does not give a ? if peling is not found, whereas
the s command will.

There may be several commands (including
a, c, i, r, w, but not g); in that case, every line
except the last must end with a backslash \:

g/xxx/.-1s/abc/def/\
.+2s/ghi/jkl/\
.-2,.p

makes changes in the lines before and after each
line that contains xxx, then prints all three
lines.

The v command is the same as g, except
that the commands are executed on every line
that does not match the string following v:

v/ /d

deletes every line that does not contain a blank.
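
The selection step of g and v can be pictured with a short Python sketch. The function names are hypothetical, and Python's re module stands in for ed's pattern matching; the two notations agree for the simple strings used here but differ in other respects.

```python
import re

def global_select(buf, pattern, invert=False):
    """Return the lines a g command (or, with invert=True, a v command)
    would act on."""
    rx = re.compile(pattern)
    return [line for line in buf if (rx.search(line) is not None) != invert]

def global_substitute(buf, pattern, replacement):
    """Rough analogue of g/pattern/s//replacement/g over the whole buffer.
    Lines that do not match are left alone and no error is raised."""
    rx = re.compile(pattern)
    return [rx.sub(replacement, line) for line in buf]
```

Here global_select(buf, " ", invert=True) lists exactly the lines that v/ /d would delete.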


Special Characters 


You may have noticed that things just don’t
work right when you use some characters like .,
*, $, and others in context searches and the
substitute command. The reason is rather complex,
although the cure is simple. Basically, ed treats
these characters as special, with special meanings.
For instance, in a context search or the
first string of the substitute command only, .
means “any character,” not a period, so

/x.y/

means “a line with an x, any character, and a
y,” not just “a line with an x, a period, and a
y.” A complete list of the special characters that
can cause trouble is the following:

^ . $ [ * \

Warning: The backslash character \ is special to
ed. For safety’s sake, avoid it where possible. If
you have to use one of the special characters in a
substitute command, you can turn off its magic
meaning temporarily by preceding it with the
backslash. Thus

s/\\\.\*/backslash dot star/

will change \.* into “backslash dot star”.


Here is a hurried synopsis of the other special
characters. First, the circumflex ^ signifies
the beginning of a line. Thus

/^string/

finds string only if it is at the beginning of a
line: it will find

string

but not

the string...

The dollar-sign $ is just the opposite of the
circumflex; it means the end of a line:

/string$/

will only find an occurrence of string that is at
the end of some line. This implies, of course,
that

/^string$/

will find only a line that contains just string,
and

/^.$/

finds a line containing exactly one character.

The character ., as we mentioned above,
matches anything;

/x.y/

matches any of

x+y
x-y
x y
x.y

This is useful in conjunction with *, which is a
repetition character; a* is a shorthand for “any
number of a’s,” so .* matches any number of
anythings. This is used like this:

s/.*/stuff/

which changes an entire line, or

s/.*,//

which deletes all characters in the line up to and
including the last comma. (Since .* finds the
longest possible match, this goes up to the last
comma.)
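
The “longest possible match” rule is easy to check outside the editor. The following Python sketch uses the re module, whose .* and anchors behave the same way as ed's for these simple patterns; the helper names are made up for this illustration.

```python
import re

def delete_through_last_comma(line):
    """Analogue of s/.*,// : .* is greedy, so the match runs to the last comma."""
    return re.sub(r'.*,', '', line, count=1)

def is_single_character_line(line):
    """Analogue of the search /^.$/ applied to one line."""
    return re.search(r'^.$', line) is not None
```

Applied to the line “one, two, three”, the first helper leaves only “ three”, because the match extends through the second comma rather than stopping at the first.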


[ is used with ] to form “character classes”;
for example,

/[0123456789]/

matches any single digit: any one of the characters
inside the brackets will cause a match. This
can be abbreviated to [0-9].


Finally, the & is another shorthand character.
It is used only on the right-hand part of a
substitute command, where it means “whatever
was matched on the left-hand side”. It is used
to save typing. Suppose the current line contained

Now is the time

and you wanted to put parentheses around it.
You could just retype the line, but this is tedious.
Or you could say

s/^/(/
s/$/)/

using your knowledge of ^ and $. But the easiest
way uses the &:

s/.*/(&)/

This says “match the whole line, and replace it
by itself surrounded by parentheses.” The & can
be used several times in a line; consider using

s/.*/&? &!!/

to produce

Now is the time? Now is the time!!

You don’t have to match the whole line, of
course: if the buffer contains

the end of the world

you could type

/world/s//& is at hand/

to produce

the end of the world is at hand

Observe this expression carefully, for it illustrates
how to take advantage of ed to save typing.
The string /world/ found the desired line;
the shorthand // found the same word in the
line; and the & saves you from typing it again.

The & is a special character only within the
replacement text of a substitute command, and
has no special meaning elsewhere. You can turn
off the special meaning of & by preceding it with
a \:

s/ampersand/\&/

will convert the word “ampersand” into the
literal symbol & in the current line.


Summary of Commands and Line Numbers


The general form of ed commands is the 
command name, perhaps preceded by one or two 
line numbers, and, in the case of e, r, and w, 
followed by a file name. Only one command is 
allowed per line, but a p command may follow 
any other command (except for e, r, w and q). 


a: Append, that is, add lines to the buffer (at 
line dot, unless a different line is specified). 
Appending continues until . is typed on a new 
line. Dot is set to the last line appended. 


c: Change the specified lines to the new text
which follows. The new lines are terminated by 
a ., as with a. If no lines are specified, replace 
line dot. Dot is set to last line changed. 


d: Delete the lines specified. If none are 
specified, delete line dot. Dot is set to the first 
undeleted line, unless $ is deleted, in which case 
dot is set to $. 


e: Edit new file. Any previous contents of the 
buffer are thrown away, so issue a w before- 
hand. 


f: Print remembered filename. If a name follows 
f the remembered name will be set to it. 


g: The command 
g/---/commands 


will execute the commands on those lines that 
contain ---, which can be any context search 
expression. 


i: Insert lines before specified line (or dot) until
a . is typed on a new line. Dot is set to last line
inserted.


m: Move lines specified to after the line named 
after m. Dot is set to the last line moved. 


p: Print specified lines. If none specified, print 
line dot. A single line number is equivalent to 
line-number p. A single return prints .+1, the 
next line. 


q: Quit ed. Wipes out all text in buffer if you 
give it twice in a row without first giving a w 
command. 


r: Read a file into buffer (at end unless specified 
elsewhere.) Dot set to last line read. 


s: The command

s/string1/string2/

replaces the characters string1 with string2
in the specified lines. If no lines are specified,
make the substitution in line dot. Dot is set to
last line in which a substitution took place,
which means that if no substitution took place,
dot is not changed. s changes only the first
occurrence of string1 on a line; to change all of
them, type a g after the final slash.


v: The command 
v/---/commands 


executes commands on those lines that do not 
contain ---. 


w: Write out buffer onto a file. Dot is not
changed.


.=: Print value of dot. (= by itself prints the
value of $.)


!: The line 
!command-line 


causes command-line to be executed as a 
UNIX command. 


/-----/: Context search. Search for next line
which contains this string of characters. Print
it. Dot is set to the line where string was found.
Search starts at .+1, wraps around from $ to 1,
and continues to dot, if necessary.


?-----?: Context search in reverse direction.
Start search at .-1, scan to 1, wrap around to $.


Advanced Editing on UNIX


Brian W. Kernighan 


Bell Laboratories 
Murray Hill, New Jersey 07974 


1. INTRODUCTION 


Although UNIX† provides remarkably
effective tools for text editing, that by itself is no
guarantee that everyone will automatically make
the most effective use of them. In particular,
people who are not computer specialists (typists,
secretaries, casual users) often use the
system less effectively than they might.

This document is intended as a sequel to A
Tutorial Introduction to the UNIX Text Editor [1],
providing explanations and examples of how to
edit with less effort. (You should also be familiar
with the material in UNIX For Beginners [2].)
Further information on all commands discussed
here can be found in The UNIX Programmer’s
Manual [3].


Examples are based on observations of 
users and the difficulties they encounter. Topics 
covered include special characters in searches 
and substitute commands, line addressing, the 
global commands, and line moving and copying. 
There are also brief discussions of effective use 
of related tools, like those for file manipulation, 
and those based on ed, like grep and sed. 


A word of caution. There is only one way
to learn to use something, and that is to use it.
Reading a description is no substitute for trying
something. A paper like this one should give
you ideas about what to try, but until you actually
try something, you will not learn it.


2. SPECIAL CHARACTERS 


The editor ed is the primary interface to 
the system for many people, so it is worthwhile 
to know how to get the most out of ed for the 
least effort. 


The next few sections will discuss
shortcuts and labor-saving devices. Not all of
these will be instantly useful to any one person,
of course, but a few will be, and the others
should give you ideas to store away for future
use. And as always, until you try these things,
they will remain theoretical knowledge, not
something you have confidence in.

†UNIX is a Trademark of Bell Laboratories.


The List command ‘l’

ed provides two commands for printing the
contents of the lines you’re editing. Most people
are familiar with p, in combinations like

1,$p

to print all the lines you’re editing, or

s/abc/def/p

to change ‘abc’ to ‘def’ on the current line. Less
familiar is the list command l (the letter ‘l’),
which gives slightly more information than p. In
particular, l makes visible characters that are
normally invisible, such as tabs and backspaces.
If you list a line that contains some of these, l
will print each tab as > and each backspace as
<. This makes it much easier to correct the sort
of typing mistake that inserts extra spaces adjacent
to tabs, or inserts a backspace followed by a
space.

The l command also ‘folds’ long lines for
printing: any line that exceeds 72 characters is
printed on multiple lines; each printed line
except the last is terminated by a backslash \, so
you can tell it was folded. This is useful for
printing long lines on short terminals.

Occasionally the l command will print in a
line a string of numbers preceded by a backslash,
such as \07 or \16. These combinations are used
to make visible characters that normally don’t
print, like form feed or vertical tab or bell. Each
such combination is a single character. When
you see such characters, be wary: they may
have surprising meanings when printed on some
terminals. Often their presence means that your
finger slipped while you were typing; you almost
never want them.
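
As a rough picture of what l does to a line, here is a small Python sketch. The function list_line and its width parameter are hypothetical. The output conventions (> for tab, < for backspace, a backslash and two octal digits for other control characters, folding at 72 columns) follow the description above, not any particular ed source, and the sketch makes no attempt to avoid splitting an escape across a fold.

```python
def list_line(line, width=72):
    """Return the printed rows an l-style listing might produce for one line."""
    out = []
    for ch in line:
        if ch == '\t':
            out.append('>')                      # tab made visible
        elif ch == '\b':
            out.append('<')                      # backspace made visible
        elif ord(ch) < 32 or ord(ch) == 127:
            out.append('\\%02o' % ord(ch))       # e.g. bell becomes \07
        else:
            out.append(ch)
    text = ''.join(out)
    rows = [text[i:i + width] for i in range(0, len(text), width)] or ['']
    # Every row except the last is marked with a trailing backslash.
    return [row + '\\' for row in rows[:-1]] + [rows[-1]]
```

A line containing a bell character between “th” and “is” comes out as th\07is, matching the example in the text.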


The Substitute Command ‘s’ 


Most of the next few sections will be taken
up with a discussion of the substitute command
s. Since this is the command for changing the
contents of individual lines, it probably has the
most complexity of any ed command, and the
most potential for effective use.

As the simplest place to begin, recall the
meaning of a trailing g after a substitute command.
With

s/this/that/

and

s/this/that/g

the first one replaces the first ‘this’ on the line
with ‘that’. If there is more than one ‘this’ on
the line, the second form with the trailing g
changes all of them.

Either form of the s command can be followed
by p or l to ‘print’ or ‘list’ (as described in
the previous section) the contents of the line:

s/this/that/p
s/this/that/l
s/this/that/gp
s/this/that/gl

are all legal, and mean slightly different things.
Make sure you know what the differences are.


Of course, any s command can be pre- 
ceded by one or two ‘line numbers’ to specify 
that the substitution is to take place on a group 
of lines. Thus 


1,$s/mispell/misspell/ 


changes the first occurrence of ‘mispell’ to 
‘misspell’ on every line of the file. But 


1,$s/mispell/misspell/g 


changes every occurrence in every line (and this 
is more likely to be what you wanted in this par- 
ticular case). 


You should also notice that if you add a p
or l to the end of any of these substitute commands,
only the last line that got changed will be
printed, not all the lines. We will talk later about
how to print all the lines that were modified.


The Undo Command ‘u’ 


Occasionally you will make a substitution
in a line, only to realize too late that it was a
ghastly mistake. The ‘undo’ command u lets
you ‘undo’ the last substitution: the last line that
was substituted can be restored to its previous
state by typing the command


u 


The Metacharacter ‘.’

As you have undoubtedly noticed when
you use ed, certain characters have unexpected
meanings when they occur in the left side of a
substitute command, or in a search for a particular
line. In the next several sections, we will talk
about these special characters, which are often
called ‘metacharacters’.

The first one is the period ‘.’. On the left
side of a substitute command, or in a search with
‘/.../’, ‘.’ stands for any single character. Thus
the search

/x.y/

finds any line where ‘x’ and ‘y’ occur separated
by a single character, as in

x+y
x-y
x□y
x.y

and so on. (We will use □ to stand for a space
whenever we need to make it visible.)

Since ‘.’ matches a single character, that
gives you a way to deal with funny characters
printed by l. Suppose you have a line that, when
printed with the l command, appears as

th\07is

and you want to get rid of the \07 (which
represents the bell character, by the way).

The most obvious solution is to try

s/\07//

but this will fail. (Try it.) The brute force solution,
which most people would now take, is to
re-type the entire line. This is guaranteed, and is
actually quite a reasonable tactic if the line in
question isn’t too big, but for a very long line,
re-typing is a bore. This is where the metacharacter
‘.’ comes in handy. Since ‘\07’ really
represents a single character, if we say

s/th.is/this/

the job is done. The ‘.’ matches the mysterious
character between the ‘h’ and the ‘i’, whatever it
is.

Bear in mind that since ‘.’ matches any
single character, the command

s/./,/

converts the first character on a line into a ‘,’,
which very often is not what you intended.

As is true of many characters in ed, the ‘.’
has several meanings, depending on its context.
This line shows all three:

.s/././

The first ‘.’ is a line number, the number of the
line we are editing, which is called ‘line dot’.
(We will discuss line dot more in Section 3.) The
second ‘.’ is a metacharacter that matches any
single character on that line. The third ‘.’ is the
only one that really is an honest literal period.
On the right side of a substitution, ‘.’ is not special.
If you apply this command to the line

Now is the time.

the result will be

.ow is the time.

which is probably not what you intended.


The Backslash ‘\’ 


Since a period means ‘any character’, the 
question naturally arises of what to do when you 
really want a period. For example, how do you 
convert the line 


Now is the time. 
into 
Now is the time? 


The backslash ‘\’ does the job. A backslash
turns off any special meaning that the next character
might have; in particular, ‘\.’ converts the
‘.’ from a ‘match anything’ into a period, so you
can use it to replace the period in

Now is the time.

like this:

s/\./?/

The pair of characters ‘\.’ is considered by ed to
be a single real period.
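
The difference between a bare period and an escaped one can be tried in Python, whose re module treats ‘.’ and ‘\.’ the same way ed does in a pattern. The helper sub_first is a made-up name that mimics an s command without a trailing g.

```python
import re

def sub_first(pattern, replacement, line):
    """Replace only the first match on the line, like s/pattern/replacement/."""
    return re.sub(pattern, replacement, line, count=1)

line = "Now is the time."
unescaped = sub_first(r'.', '.', line)    # '.' matches the leading 'N'
escaped = sub_first(r'\.', '?', line)     # '\.' matches only the real period
```

After this runs, unescaped holds “.ow is the time.” and escaped holds “Now is the time?”.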


The backslash can also be used when
searching for lines that contain a special character.
Suppose you are looking for a line that contains

.PP

The search

/.PP/

isn’t adequate, for it will find a line like

THE APPLICATION OF ...

because the ‘.’ matches the letter ‘A’. But if you
say

/\.PP/

you will find only lines that contain ‘.PP’.

The backslash can also be used to turn off
special meanings for characters other than ‘.’.
For example, consider finding a line that contains
a backslash. The search

/\/

won’t work, because the ‘\’ isn’t a literal ‘\’, but
instead means that the second ‘/’ no longer
delimits the search. But by preceding a backslash
with another one, you can search for a literal
backslash. Thus

/\\/

does work. Similarly, you can search for a forward
slash ‘/’ with

/\//

The backslash turns off the meaning of the
immediately following ‘/’ so that it doesn’t
terminate the /.../ construction prematurely.


As an exercise, before reading further,
find two substitute commands each of which will
convert the line

\x\.\y

into the line

\x\y

Here are several solutions; verify that each
works as advertised.

s/\\\.//
s/x../x/
s/..y/y/
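
If you have no terminal handy, the three solutions can be checked with a few lines of Python. This is only a cross-check under the assumption that Python's re module handles these particular patterns the way ed does, which holds for ‘.’ and backslash escapes; the names LINE, SOLUTIONS and results are invented for the example.

```python
import re

LINE = r'\x\.\y'                      # the six characters \ x \ . \ y

SOLUTIONS = [
    (r'\\\.', ''),                    # s/\\\.//  : remove a backslash and a period
    (r'x..', 'x'),                    # s/x../x/  : x plus the next two characters
    (r'..y', 'y'),                    # s/..y/y/  : the two characters before y
]

results = [re.sub(pattern, replacement, LINE, count=1)
           for pattern, replacement in SOLUTIONS]
```

Each entry of results should equal the target line \x\y.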


A couple of miscellaneous notes about
backslashes and special characters. First, you
can use any character to delimit the pieces of an
s command: there is nothing sacred about
slashes. (But you must use slashes for context
searching.) For instance, in a line that contains a
lot of slashes already, like

//exec //sys.fort.go // etc...

you could use a colon as the delimiter; to
delete all the slashes, type

s:/::g

Second, if # and @ are your character
erase and line kill characters, you have to type
\# and \@; this is true whether you’re talking to
ed or any other program.

When you are adding text with a or i or c,
backslash is not special, and you should only put
in one backslash for each one you really want.


The Dollar Sign ‘$’

The next metacharacter, the ‘$’, stands for
‘the end of the line’. As its most obvious use,
suppose you have the line

Now is the

and you wish to add the word ‘time’ to the end.
Use the $ like this:

s/$/□time/

to get

Now is the time

Notice that a space is needed before ‘time’ in the
substitute command, or you will get

Now is thetime

As another example, replace the second
comma in the following line with a period
without altering the first:

Now is the time, for all good men,

The command needed is

s/,$/./

The $ sign here provides context to make specific
which comma we mean. Without it, of course,
the s command would operate on the first
comma to produce

Now is the time. for all good men,

As another example, to convert

Now is the time.

into

Now is the time?

as we did earlier, we can use

s/.$/?/

Like ‘.’, the ‘$’ has multiple meanings
depending on context. In the line

$s/$/$/

the first ‘$’ refers to the last line of the file, the
second refers to the end of that line, and the
third is a literal dollar sign, to be added to that
line.
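
The effect of anchoring with $ can be reproduced in Python, where $ in a pattern also means the end of the line for single-line strings like these. The helper sub_first is a hypothetical stand-in for an s command without the trailing g.

```python
import re

def sub_first(pattern, replacement, line):
    """Replace only the first match on the line, like s/pattern/replacement/."""
    return re.sub(pattern, replacement, line, count=1)

line = "Now is the time, for all good men,"
anchored = sub_first(r',$', '.', line)     # only the comma at the end of the line
unanchored = sub_first(r',', '.', line)    # the first comma on the line
appended = sub_first(r'$', ' time', "Now is the")
```

Comparing anchored with unanchored shows the context that the $ supplies.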


The Circumflex ‘^’

The circumflex (or hat or caret) ‘^’ stands
for the beginning of the line. For example, suppose
you are looking for a line that begins with
‘the’. If you simply say

/the/

you will in all likelihood find several lines that
contain ‘the’ in the middle before arriving at the
one you want. But with

/^the/

you narrow the context, and thus arrive at the
desired one more easily.

The other use of ‘^’ is of course to enable
you to insert something at the beginning of a
line:

s/^/□/

places a space at the beginning of the current
line.

Metacharacters can be combined. To
search for a line that contains only the characters

.PP

you can use the command

/^\.PP$/


The Star ‘*’

Suppose you have a line that looks like
this:

text x                y text

where text stands for lots of text, and there are
some indeterminate number of spaces between
the x and the y. Suppose the job is to replace all
the spaces between x and y by a single space.
The line is too long to retype, and there are too
many spaces to count. What now?

This is where the metacharacter ‘*’ comes
in handy. A character followed by a star stands
for as many consecutive occurrences of that
character as possible. To refer to all the spaces
at once, say

s/x□*y/x□y/

The construction ‘□*’ means ‘as many spaces as
possible’. Thus ‘x□*y’ means ‘an x, as many
spaces as possible, then a y’.

The star can be used with any character,
not just space. If the original example was
instead

text x--------y text

then all ‘-’ signs can be replaced by a single
space with the command

s/x-*y/x□y/

Finally, suppose that the line was

text x..................y text

Can you see what trap lies in wait for the
unwary? If you blindly type

s/x.*y/x□y/

what will happen? The answer, naturally, is that
it depends. If there are no other x’s or y’s on
the line, then everything works, but it’s blind
luck, not good management. Remember that ‘.’
matches any single character? Then ‘.*’ matches
as many single characters as possible, and unless
you’re careful, it can eat up a lot more of the
line than you expected. If the line was, for
example, like this:

text x text x................y text y text

then saying

s/x.*y/x□y/

will take everything from the first ‘x’ to the last
‘y’, which, in this example, is undoubtedly more
than you wanted.

The solution, of course, is to turn off the
special meaning of ‘.’ with ‘\.’:

s/x\.*y/x□y/

Now everything works, for ‘\.*’ means ‘as many
periods as possible’.

There are times when the pattern ‘.*’ is
exactly what you want. For example, to change

Now is the time for all good men ....

into

Now is the time.

use ‘.*’ to eat up everything after the ‘for’:

s/□for.*/./

There are a couple of additional pitfalls
associated with ‘*’ that you should be aware of.
Most notable is the fact that ‘as many as possible’
means zero or more. The fact that zero is a
legitimate possibility is sometimes rather surprising.
For example, if our line contained

text xy text x        y text

and we said

s/x□*y/x□y/

the first ‘xy’ matches this pattern, for it consists
of an ‘x’, zero spaces, and a ‘y’. The result is
that the substitute acts on the first ‘xy’, and does
not touch the later one that actually contains
some intervening spaces.

The way around this, if it matters, is to
specify a pattern like

/x□□*y/

which says ‘an x, a space, then as many more
spaces as possible, then a y’, in other words, one
or more spaces.

The other startling behavior of ‘*’ is again
related to the fact that zero is a legitimate
number of occurrences of something followed by
a star. The command

s/x*/y/g

when applied to the line

abcdef

produces

yaybycydyeyfy

which is almost certainly not what was intended.
The reason for this behavior is that zero is a
legal number of matches, and there are no x’s at
the beginning of the line (so that gets converted
into a ‘y’), nor between the ‘a’ and the ‘b’ (so
that gets converted into a ‘y’), nor... and so on.
Make sure you really want zero matches; if not,
in this case write

s/xx*/y/g

‘xx*’ is one or more x’s.
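
The zero-occurrence pitfalls are worth seeing in action. The sketch below uses Python's re module (version 3.7 or later, where empty matches are handled as shown); its star behaves like ed's for these patterns. The sample lines and variable names are invented for the demonstration, with a real space written where the text uses □.

```python
import re

line = "text xy text x    y text"

# 'x *y' allows zero spaces, so the first 'xy' is the one that gets changed.
zero_ok = re.sub(r'x *y', 'x y', line, count=1)

# 'x  *y' demands at least one space, so the later occurrence is changed.
one_or_more = re.sub(r'x  *y', 'x y', line, count=1)

# Without count=1 the substitution is global, like a trailing g in ed.
everywhere = re.sub(r'x*', 'y', 'abcdef')
at_least_one = re.sub(r'xx*', 'y', 'abcdef')
```

Here everywhere reproduces the surprising yaybycydyeyfy, while at_least_one leaves abcdef untouched because there are no x’s to match.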


The Brackets ‘[ ]’

Suppose that you want to delete any
numbers that appear at the beginning of all lines
of a file. You might first think of trying a series
of commands like

1,$s/^1*//
1,$s/^2*//
1,$s/^3*//

and so on, but this is clearly going to take forever
if the numbers are at all long. Unless you
want to repeat the commands over and over until
finally all numbers are gone, you must get all the
digits on one pass. This is the purpose of the
brackets [ and ].

The construction

[0123456789]

matches any single digit; the whole thing is
called a ‘character class’. With a character class,
the job is easy. The pattern ‘[0123456789]*’
matches zero or more digits (an entire number),
so

1,$s/^[0123456789]*//

deletes all digits from the beginning of all lines.

Any characters can appear within a character
class, and just to confuse the issue there are
essentially no special characters inside the brackets;
even the backslash doesn’t have a special
meaning. To search for special characters, for
example, you can say

/[.\$^[]/

Within [...], the ‘[’ is not special. To get a ‘]’
into a character class, make it the first character.

It’s a nuisance to have to spell out the
digits, so you can abbreviate them as [0-9];
similarly, [a-z] stands for the lower case letters,
and [A-Z] for upper case.

As a final frill on character classes, you can
specify a class that means ‘none of the following
characters’. This is done by beginning the class
with a ‘^’:

[^0-9]

stands for ‘any character except a digit’. Thus
you might find the first line that doesn’t begin
with a tab or space by a search like

/^[^(space)(tab)]/

Within a character class, the circumflex has
a special meaning only if it occurs at the beginning.
Just to convince yourself, verify that

/^[^^]/

finds a line that doesn’t begin with a circumflex.
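
Character classes can be tried out in Python as well. One caution: unlike ed as described above, Python does treat the backslash as special inside brackets, so this sketch sticks to classes where the two notations agree. The function names are hypothetical.

```python
import re

def strip_leading_digits(line):
    """Analogue of s/^[0-9]*// on one line."""
    return re.sub(r'^[0-9]*', '', line, count=1)

def begins_with_non_blank(line):
    """Analogue of the search /^[^(space)(tab)]/ on one line."""
    return re.search(r'^[^ \t]', line) is not None

def begins_with_non_circumflex(line):
    """Analogue of /^[^^]/ : only the first ^ inside the class means 'not'."""
    return re.search(r'^[^^]', line) is not None
```

Running strip_leading_digits over every line of a buffer corresponds to giving the s command the address range 1,$.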


The Ampersand ‘&’ 


The ampersand ‘&’ is used primarily to 
save typing. Suppose you have the line 


Now is the time 
and you want to make it 
Now is the best time 
Of course you can always say 
s/the/the best/ 


but it seems silly to have to repeat the ‘the’. 
The ‘&’ is used to eliminate the repetition. On 
the right side of a substitute, the ampersand 
means ‘whatever was just matched’, so you can 
say 


s/the/& best/ 


and the ‘&’ will stand for ‘the’. Of course this
isn’t much of a saving if the thing matched is
just ‘the’, but if it is something truly long or
awful, or if it is something like ‘.*’ which
matches a lot of text, you can save some tedious
typing. There is also much less chance of making
a typing error in the replacement text. For
example, to parenthesize a line, regardless of its
length,

s/.*/(&)/

The ampersand can occur more than once
on the right side:

s/the/& best and & worst/

makes

Now is the best and the worst time

and

s/.*/&? &!!/

converts the original line into

Now is the time? Now is the time!!


To get a literal ampersand, naturally the 
backslash is used to turn off the special meaning: 


s/ampersand/\&/ 


converts the word into the symbol. Notice that 
‘&’ is not special on the left side of a substitute, 
only on the right side. 
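
Python's own replacement strings do not use & this way, so the following sketch expands an ed-style right-hand side by hand. The functions expand and substitute are illustrative names only, and the sketch deliberately ignores other backslash forms such as \1.

```python
import re

def expand(replacement, matched):
    """Expand an ed-style replacement: & stands for the matched text,
    and a backslash makes the following character literal."""
    out = []
    i = 0
    while i < len(replacement):
        ch = replacement[i]
        if ch == '\\' and i + 1 < len(replacement):
            out.append(replacement[i + 1])   # escaped character, taken as is
            i += 2
        elif ch == '&':
            out.append(matched)              # whatever the left side matched
            i += 1
        else:
            out.append(ch)
            i += 1
    return ''.join(out)

def substitute(pattern, replacement, line):
    """Apply the substitution to the first match only, like s without g."""
    return re.sub(pattern, lambda m: expand(replacement, m.group(0)),
                  line, count=1)
```

For example, substitute('the', '& best', 'Now is the time') gives the same result as the s command shown earlier in this section.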


Substituting Newlines 


ed provides a facility for splitting a single
line into two or more shorter lines by ‘substituting
in a newline’. As the simplest example, suppose
a line has gotten unmanageably long
because of editing (or merely because it was
unwisely typed). If it looks like

text     xy     text

you can break it between the ‘x’ and the ‘y’ like
this:

s/xy/x\
y/

This is actually a single command, although it is
typed on two lines. Bearing in mind that ‘\’
turns off special meanings, it seems relatively
intuitive that a ‘\’ at the end of a line would
make the newline there no longer special.

You can in fact make a single line into
several lines with this same mechanism. As a
large example, consider underlining the word
‘very’ in a long line by splitting ‘very’ onto a
separate line, and preceding it by the roff or nroff
formatting command ‘.ul’.

text a very big text

The command

s/□very□/\
.ul\
very\
/

converts the line into four shorter lines, preceding
the word ‘very’ by the line ‘.ul’, and eliminating
the spaces around the ‘very’, all at the
same time.

When a newline is substituted in, dot is
left pointing at the last line created.


Joining Lines 

Lines may also be joined together, but this
is done with the j command instead of s. Given
the lines

Now is
□the time

and supposing that dot is set to the first of them,
then the command

j

joins them together. No blanks are added, which
is why we carefully showed a blank at the beginning
of the second line.

All by itself, a j command joins line dot to
line dot+1, but any contiguous set of lines can
be joined. Just specify the starting and ending
line numbers. For example,

1,$jp

joins all the lines into one big one and prints it.
(More on line numbers in Section 3.)


Rearranging a Line with \(... \) 


(This section should be skipped on first 
reading.) Recall that ‘&’ is a shorthand that 
stands for whatever was matched by the left side 
of an s command. In much the same way you 
can capture separate pieces of what was matched; 
the only difference is that you have to specify on 
the left side just what pieces you're interested in. 


Suppose, for instance, that you have a file 
of lines that consist of names in the form 


Smith, A. B. 
Jones, C. 


and so on, and you want the initials to precede 
the name, as in 


A. B. Smith 
C. Jones 


It is possible to do this with a series of editing 
commands, but it is tedious and error-prone. (It 
is instructive to figure out how it is done, 
though.) 


The alternative is to ‘tag’ the pieces of the 
pattern (in this case, the last name, and the ini- 
tials), and then rearrange the pieces. On the left 
side of a substitution, if part of the pattern is 
enclosed between \( and \), whatever matched 
that part is remembered, and available for use on 
the right side. On the right side, the symbol ‘\1’ 
refers to whatever matched the first \(...\) pair, 
‘\2’ to the second \(...\), and so on. 


The command

1,$s/^\([^,]*\),□*\(.*\)/\2□\1/

although hard to read, does the job. The first
\(...\) matches the last name, which is any string
up to the comma; this is referred to on the right
side with ‘\1’. The second \(...\) is whatever
follows the comma and any spaces, and is
referred to as ‘\2’.
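
The same rearrangement can be tried in Python as a cross-check. Note the change of notation: Python writes the tagging parentheses as plain ( and ) where ed uses \( and \), while \1 and \2 on the right side mean the same thing. The function name initials_first is made up for this sketch.

```python
import re

def initials_first(line):
    """Move whatever follows the comma (and any spaces) in front of the
    text that precedes the comma; lines without a comma are unchanged."""
    return re.sub(r'^([^,]*), *(.*)', r'\2 \1', line, count=1)

names = ["Smith, A. B.", "Jones, C."]
rearranged = [initials_first(name) for name in names]
```

The list comprehension plays the role of the 1,$ address range, applying the substitution to every line.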


Of course, with any editing sequence this
complicated, it’s foolhardy to simply run it and
hope. The global commands g and v discussed
in section 4 provide a way for you to print
exactly those lines which were affected by the
substitute command, and thus verify that it did
what you wanted in all cases.


3. LINE ADDRESSING IN THE EDITOR 


The next general area we will discuss is 
that of line addressing in ed, that is, how you 
specify what lines are to be affected by editing 
commands. We have already used constructions 
like 


1,$s/x/y/ 


to specify a change on all lines. And most users
are long since familiar with using a single new- 
line (or return) to print the next line, and with 


/thing/ 


to find a line that contains ‘thing’. Less familiar, 
surprisingly enough, is the use of 


?thing?


to scan backwards for the previous occurrence of 
‘thing’. This is especially handy when you real- 
ize that the thing you want to operate on is back 
up the page from where you are currently edit- 
ing. 

The slash and question mark are the only 
characters you can use to delimit a context 
search, though you can use essentially any
character in a substitute command.


Address Arithmetic 


The next step is to combine the line 
numbers like ‘.’, ‘$’, ‘/.../’ and ‘?...?’ with ‘+’
and ‘-’. Thus


$-1


is a command to print the next to last line of the
current file (that is, one line before line ‘$’).
For example, to recall how far you got in a previ- 
ous editing session, 


$-5,$p


prints the last six lines. (Be sure you understand 
why it’s six, not five.) If there aren't six, of 
course, you'll get an error message. 


As another example, 
.-3,.+3p


prints from three lines before where you are now 
(at line dot) to three lines after, thus giving you 
a bit of context. By the way, the ‘+’ can be
omitted:


.-3,.3p


is absolutely identical in meaning.

Another area in which you can save typing
effort in specifying lines is to use ‘-’ and ‘+’ as
line numbers by themselves.


-


by itself is a command to move back up one line
in the file. In fact, you can string several minus
signs together to move back up that many lines:


---


moves up three lines, as does ‘-3’. Thus


-3,+3p


is also identical to the examples above.

Since ‘-’ is shorter than ‘.-1’, constructions
like


-,.s/bad/good/


are useful. This changes ‘bad’ to ‘good’ on the 
previous line and on the current line. 


‘+’ and ‘-’ can be used in combination
with searches using ‘/.../’ and ‘?...?’, and with
‘$’. The search


/thing/--


finds the line containing ‘thing’, and positions 
you two lines before it. 
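The address arithmetic above can be tried out non-interactively. The following is only a sketch: the ten-line file demo.txt is invented for the purpose, ed is run with -s so that it prints no byte counts, and Q is used to leave without being asked about unsaved changes.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical ten-line file: line1 ... line10.
printf 'line%s\n' 1 2 3 4 5 6 7 8 9 10 > demo.txt

ed -s demo.txt <<'EOF' > out.txt
$-1
$-5,$p
/line4/--
Q
EOF
cat out.txt
```

The first command prints the next to last line, the second prints the last six lines, and the third finds ‘line4’ and then backs up two lines from it.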


Repeated Searches 
Suppose you ask for the search 


/horrible thing/ 


and when the line is printed you discover that it 
isn’t the horrible thing that you wanted, so it is 
necessary to repeat the search again. You don’t 
have to re-type the search, for the construction 


// 


is a shorthand for ‘the previous thing that was 
searched for’, whatever it was. This can be 
repeated as many times as necessary. You can 
also go backwards: 


??


searches for the same thing, but in the reverse 
direction. 


Not only can you repeat the search, but 
you can use ‘//’ as the left side of a substitute
command, to mean ‘the most recent pattern’.


/horrible thing/

.... ed prints line with ‘horrible thing’ ... 

s//good/p 
To go backwards and change a line, say 

??s//good/


Of course, you can still use the ‘&’ on the right
hand side of a substitute to stand for whatever
got matched:


//s//&&/p


finds the next occurrence of whatever you 
searched for last, replaces it by two copies of 
itself, then prints the line just to verify that it 
worked. 
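A small scripted session shows the remembered pattern at work. This is a sketch under assumptions: the file notes.txt and its three lines are made up, and ed is driven from a here-document rather than from the terminal.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' 'a horrible thing' 'fine' 'another horrible thing' > notes.txt

ed -s notes.txt <<'EOF' > out.txt
/horrible thing/
s//good/p
//
Q
EOF
cat out.txt
```

The search wraps around from the end of the buffer and prints the first line; ‘s//good/p’ reuses the pattern and prints the changed line; ‘//’ then repeats the search and lands on the third line.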


Default Line Numbers and the Value of Dot 


One of the most effective ways to speed up 
your editing is always to know what lines will be 
affected by a command if you don’t specify the 
lines it is to act on, and on what line you will be 
positioned (i.e., the value of dot) when a com- 
mand finishes. If you can edit without specifying 
unnecessary line numbers, you can save a lot of 
typing. 

As the most obvious example, if you issue 
a search command like 


/thing/ 


you are left pointing at the next line that con- 
tains ‘thing’. Then no address is required with 
commands like s to make a substitution on that 
line, or p to print it, or l to list it, or d to delete
it, or a to append text after it, or c to change it,
or i to insert text before it.


What happens if there was no ‘thing’? 
Then you are left right where you were — dot is 
unchanged. This is also true if you were sitting 
on the only ‘thing’ when you issued the com- 
mand. The same rules hold for searches that use 
‘?...?’; the only difference is the direction in
which you search. 


The delete command d leaves dot pointing 
at the line that followed the last deleted line. 
When line ‘$’ gets deleted, however, dot points
at the new line ‘$’.


The line-changing commands a, c and i by
default all affect the current line — if you give 
no line number with them, a appends text after 
the current line, c changes the current line, and i 
inserts text before the current line. 


a, c, and i behave identically in one 
respect — when you stop appending, changing or 
inserting, dot points at the last line entered. 
This is exactly what you want for typing and edit- 
ing on the fly. For example, you can say 


a
... text ...
... botch ...            (minor error)
.
s/botch/correct/         (fix botched line)
a
... more text ...


without specifying any line number for the
substitute command or for the second append
command. Or you can say


a
... text ...
... horrible botch ...   (major error)
.
c                        (replace entire line)
... fixed up line ...


You should experiment to determine what 
happens if you add no lines with a, c or i.


The r command will read a file into the 
text being edited, either at the end if you give no 
address, or after the specified line if you do. In 
either case, dot points at the last line read in. 
Remember that you can even say 0r to read a
file in at the beginning of the text. (You can
also say 0a or 1i to start adding text at the
beginning.)


The w command writes out the entire file. 
If you precede the command by one line 
number, that line is written, while if you precede 
it by two line numbers, that range of lines is 
written. The w command does not change dot:
the current line remains the same, regardless of 
what lines are written. This is true even if you 
say something like 


/^\.AB/,/^\.AE/w abstract


which involves a context search. 


Since the w command is so easy to use, 
you should save what you are editing regularly as 
you go along just in case the system crashes, or 
in case you do something foolish, like clobbering 
what you’re editing. 


The least intuitive behavior, in a sense, is 
that of the s command. The rule is simple:
you are left sitting on the last line that got 
changed. If there were no changes, then dot is 
unchanged. 


To illustrate, suppose that there are three 
lines in the buffer, and you are sitting on the 
middle one: 


x1 
x2 
x3 


Then the command 
-,+s/x/y/p


prints the third line, which is the last one 
changed. But if the three lines had been 


x1 
y2 
y3 


and the same command had been issued while
dot pointed at the second line, then the result
would be to change and print only the first line,
and that is where dot would be set. 
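The two cases can be checked with a short script. This is a sketch, assuming two invented files a.txt and b.txt that hold the two buffers shown above; ‘.=’ is used to print the line number of dot after the substitute.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf 'x1\nx2\nx3\n' > a.txt    # first buffer from the text
printf 'x1\ny2\ny3\n' > b.txt    # second buffer from the text

for f in a.txt b.txt; do
    ed -s "$f" <<'EOF'
2
-,+s/x/y/p
.=
Q
EOF
done > out.txt
cat out.txt
```

For each file the output is the middle line (printed by ‘2’), the line printed by the substitute, and the value of dot afterwards: line 3 in the first case, line 1 in the second.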


Semicolon ‘;’


Searches with ‘/.../’ and ‘?...?’ start at the
current line and move forward or backward 
respectively until they either find the pattern or 
get back to the current line. Sometimes this is 
not what is wanted. Suppose, for example, that 
the buffer contains lines like this:


.
.
.
ab
.
.
.
bc
.
.
.


Starting at line 1, one would expect that the
command 


/a/,/b/p 


prints all the lines from the ‘ab’ to the ‘bc’
inclusive. Actually this is not what happens. 
Both searches (for ‘a’ and for ‘b’) start from the 
same point, and thus they both find the line that 
contains ‘ab’. The result is to print a single line. 
Worse, if there had been a line with a ‘b’ in it 
before the ‘ab’ line, then the print command 
would be in error, since the second line number 
would be less than the first, and it is illegal to try 
to print lines in reverse order. 


This is because the comma separator for 
line numbers doesn't set dot as each address is 
processed: each search starts from the same 
place. In ed, the semicolon ‘;’ can be used just
like comma, with the single difference that use
of a semicolon forces dot to be set at that point
as the line numbers are being evaluated. In
effect, the semicolon ‘moves’ dot. Thus in our
example above, the command


/a/;/b/p


prints the range of lines from ‘ab’ to ‘bc’,
because after the ‘a’ is found, dot is set to that 
line, and then ‘b’ is searched for, starting beyond 
that line. 
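To see the difference between comma and semicolon side by side, here is a sketch that uses a five-line buffer invented for the purpose (the filler lines x, y and z stand for the unrelated lines in the example above).

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' x ab y bc z > buf.txt

ed -s buf.txt <<'EOF' > out.txt
1
/a/,/b/p
1
/a/;/b/p
Q
EOF
cat out.txt
```

Each ‘1’ prints line 1 and resets dot. With the comma both searches start from line 1 and both stop at ‘ab’, so one line is printed; with the semicolon the second search starts beyond ‘ab’, so the range runs through ‘bc’.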


This property is most often useful in a 
very simple situation. Suppose you want to find 
the second occurrence of ‘thing’. You could say 


/thing/ 
// 


but this prints the first occurrence as well as the
second, and is a nuisance when you know very
well that it is only the second one you're 
interested in. The solution is to say 


/thing/;// 


This says to find the first occurrence of ‘thing’, 
set dot to that line, then find the second and 
print only that. 


Closely related is searching for the second 
previous occurrence of something, as in 


?something?;?? 


Printing the third or fourth or ... in either direc- 
tion is left as an exercise. 


Finally, bear in mind that if you want to 
find the first occurrence of something in a file, 
starting at an arbitrary place within the file, it is
not sufficient to say 


1,/thing/ 


because this fails if ‘thing’ occurs on line 1. But
it is possible to say


0;/thing/ 


(one of the few places where 0 is a legal line 
number), for this starts the search at line 1.


Interrupting the Editor 


As a final note on what dot gets set to, you 
should be aware that if you hit the interrupt or 
delete or rubout or break key while ed is doing a 
command, things are put back together again and 
your state is restored as much as possible to what 
it was before the command began. Naturally, 
some changes are irrevocable: if you are reading
or writing a file or making substitutions or
deleting lines, these will be stopped in some
clean but unpredictable state in the middle 
(which is why it is not usually wise to stop 
them). Dot may or may not be changed. 


Printing is more clear cut. Dot is not 
changed until the printing is done. Thus if you 
print until you see an interesting line, then hit 
delete, you are not sitting on that line or even
near it. Dot is left where it was when the p
command was started.


4. GLOBAL COMMANDS 


The global commands g and v are used to 
perform one or more editing commands on all 
lines that either contain (g) or don’t contain (v)
a specified pattern. 


As the simplest example, the command 


g/UNIX/p 


prints all lines that contain the word ‘UNIX’.
The pattern that goes between the slashes can be
anything that could be used in a line search or in
a substitute command; exactly the same rules
and limitations apply.


As another example, then, 
g/^\./p


prints all the formatting commands in a file
(lines that begin with ‘.’).


The v command is identical to g, except 
that it operates on those lines that do not contain
an occurrence of the pattern. (Don't look too
hard for mnemonic significance to the letter ‘v’.)
So


v/^\./p


prints all the lines that don't begin with ‘.’, that
is, the actual text lines.


The command that follows g or v can be 
anything: 


g/^\./d


deletes all lines that begin with ‘.’, and


g/^$/d


deletes all empty lines.


Probably the most useful command that 
can follow a global is the substitute command, 
for this can be used to make a change and print 
each affected line for verification. For example, 
we could change the word ‘Unix’ to ‘UNIX’ 
everywhere, and verify that it really worked, with 


g/Unix/s//UNIX/gp 


Notice that we used ‘//’ in the substitute
command to mean ‘the previous pattern’, in this
case, ‘Unix’. The p command is done on every 
line that matches the pattern, not just those on 
which a substitution took place. 


The global command operates by making 
two passes over the file. On the first pass, all 
lines that match the pattern are marked. On the 
second pass, each marked line in turn is exam- 
ined, dot is set to that line, and the command 
executed. This means that it is possible for the 
command that follows a g or v to use addresses, 
set dot, and so on, quite freely. 


g/^\.PP/+


prints the line that follows each ‘.PP’ command 
(the signal for a new paragraph in some format- 
ting packages). Remember that ‘+’ means ‘one 
line past dot’. And 


g/topic/?^\.SH?1


searches for each line that contains ‘topic’, scans
backwards until it finds a line that begins ‘.SH’
(a section heading) and prints the line that
follows that, thus showing the section headings
under which ‘topic’ is mentioned. Finally,


g/^\.EQ/+,/^\.EN/-p


prints all the lines that lie between lines begin- 
ning with ‘.EQ’ and ‘.EN’ formatting commands. 


The g and v commands can also be pre- 
ceded by line numbers, in which case the lines 
searched are only those in the range specified. 
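As an illustration of g and v, the sketch below applies two of the commands from this section to a small invented file, doc.txt, that mixes formatting commands and text.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' '.PP' 'Unix is fun.' '.SH' 'Unix and more Unix' '.PP' 'Nothing here.' > doc.txt

ed -s doc.txt <<'EOF' > out.txt
g/Unix/s//UNIX/gp
v/^\./p
Q
EOF
cat out.txt
```

The g command changes and prints the two lines that contain ‘Unix’; the v command then prints the three lines that do not begin with ‘.’. Since the script ends with Q rather than w, doc.txt is not rewritten.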


Multi-line Global Commands 


It is possible to do more than one com- 
mand under the control of a global command, 
although the syntax for expressing the operation 
is not especially natural or pleasant. As an 
example, suppose the task is to change ‘x’ to ‘y’ 
and ‘a’ to ‘b’ on all lines that contain ‘thing’.
Then 


g/thing/s/x/y/\ 
s/a/b/


is sufficient. The ‘\’ signals the g command that
the set of commands continues on the next line; 
it terminates on the first line that does not end 
with ‘\’. (As a minor blemish, you can't use a 
substitute command to insert a newline within a 
g command.) 
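The multi-line form can be tested from a script as long as the backslash reaches ed intact. In the sketch below the here-document delimiter is quoted for exactly that reason; the file g.txt and its contents are invented.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' 'thing x a' 'other x a' 'thing xa' > g.txt

# The quoted 'EOF' keeps the shell from swallowing the trailing backslash.
ed -s g.txt <<'EOF' > out.txt
g/thing/s/x/y/\
s/a/b/
,p
Q
EOF
cat out.txt
```

Only the two lines that contain ‘thing’ are changed; the middle line is printed as it was.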


You should watch out for this problem: 
the command 


g/x/s//y/\ 
s/a/b/ 


does not work as you expect. The remembered 
pattern is the last pattern that was actually exe- 
cuted, so sometimes it will be ‘x’ (as expected), 
and sometimes it will be ‘a’ (not expected). You
must spell it out, like this: 


g/x/s/x/y/\ 
s/a/b/ 


It is also possible to execute a, c and i
commands under a global command; as with 
other multi-line constructions, all that is needed 
is to add a ‘\’ at the end of each line except the 
last. Thus to add a ‘.nf’ and ‘.sp’ command 
before each ‘.EQ’ line, type 


g/^\.EQ/i\
.nf\
.sp


There is no need for a final line containing a ‘.’ 
to terminate the i command, unless there are 
further commands being done under the global. 
On the other hand, it does no harm to put it in 
either.


5. CUT AND PASTE WITH UNIX COMMANDS


One editing area in which non- 
programmers seem not very confident is in what 
might be called ‘cut and paste’ operations — 
changing the name of a file, making a copy of a 
file somewhere else, moving a few lines from 
one place to another in a file, inserting one file in 
the middle of another, splitting a file into pieces, 
and splicing two or more files together. 


Yet most of these operations are actually 
quite easy, if you keep your wits about you and 
go cautiously. The next several sections talk 
about cut and paste. We will begin with the UNIX 
commands for moving entire files around, then 
discuss ed commands for operating on pieces of 
files. 


Changing the Name of a File 


You have a file named ‘memo’ and you 
want it to be called ‘paper’ instead. How is it 
done? 


The UNIX program that renames files is 
called mv (for ‘move’); it ‘moves’ the file from 
one name to another, like this: 


mv memo paper 


That’s all there is to it: mv from the old name to 
the new name. 


mv oldname newname 


Warning: if there is already a file around with the 
new name, its present contents will be silently 
clobbered by the information from the other file. 
The one exception is that you can’t move a file 
to itself — 


mv x x


is illegal. 


Making a Copy of a File 


Sometimes what you want is a copy of a 
file — an entirely fresh version. This might be 
because you want to work on a file, and yet save 
a copy in case something gets fouled up, or just 
because you're paranoid. 


In any case, the way to do it is with the cp 
command. (cp stands for ‘copy’; the system is 
big on short command names, which are appreci- 
ated by heavy users, but sometimes a strain for 
novices.) Suppose you have a file called ‘good’ 
and you want to save a copy before you make 
some dramatic editing changes. Choose a name 
— ‘savegood’ might be acceptable — then type 


cp good savegood 


This copies ‘good’ onto ‘savegood’, and you now
have two identical copies of the file ‘good’. (If
‘savegood’ previously contained something, it
gets overwritten.)


Now if you decide at some time that you 
want to get back to the original state of ‘good’, 
you can say 


mv savegood good 


(if you're not interested in ‘savegood’ any 
more), or 


cp savegood good 
if you still want to retain a safe copy. 


In summary, mv just renames a file. cp 
makes a duplicate copy. Both of them clobber 
the ‘target’ file if it already exists, so you had 
better be sure that’s what you want to do before
you do it. 
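The save-and-restore routine described above looks like this when run from start to finish. It is a sketch only; the file contents are invented, and the overwritten ‘good’ stands in for an editing session that went wrong.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf 'original text\n' > good
cp good savegood                  # keep a safe copy before editing
printf 'botched edit\n' > good    # stand-in for a bad editing session
mv savegood good                  # restore; this clobbers the botched 'good'
cat good
```

After the mv there is no ‘savegood’ any more; use cp for the last step instead if you want to keep the safe copy around.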


Removing a File 


If you decide you are really done with a 
file forever, you can remove it with the rm com- 
mand: 


rm savegood 


throws away (irrevocably) the file called 
‘savegood’. 


Putting Two or More Files Together 


The next step is the familiar one of collect- 
ing two or more files into one big one. This will 
be needed, for example, when the author of a 
paper decides that several sections need to be 
combined into one. There are several ways to do 
it, of which the cleanest, once you get used to it, 
is a program called cat. (Not all programs have
two-letter names.) cat is short for ‘concatenate’, 
which is exactly what we want to do. 


Suppose the job is to combine the files 
‘file1’ and ‘file2’ into a single file called ‘bigfile’.
If you say 


cat file 


the contents of ‘file’ will get printed on your
terminal. If you say


cat file1 file2


the contents of ‘file1’ and then the contents of
‘file2’ will both be printed on your terminal, in 
that order. So cat combines the files, all right, 
but it’s not much help to print them on the ter- 
minal — we want them in ‘bigfile’. 


Fortunately, there is a way. You can tell 
the system that instead of printing on your
terminal, you want the same information put in a
file. The way to do it is to add to the command 
line the character > and the name of the file 


where you want the output to go. Then you can 
say 


cat file1 file2 >bigfile


and the job is done. (As with cp and mv, you're
putting something into ‘bigfile’, and anything 
that was already there is destroyed.) 


This ability to ‘capture’ the output of a 
program is one of the most useful aspects of the 
system. Fortunately it’s not limited to the cat 
program: you can use it with any program that
prints on your terminal. We'll see some more 
uses for it in a moment. 


Naturally, you can combine several files, 
not just two: 


cat file1 file2 file3 >bigfile


collects a whole bunch.

Question: is there any difference between


cp good savegood


and


cat good >savegood


Answer: for most purposes, no. You might rea- 
sonably ask why there are two programs in that 
case, since cat is obviously all you need. The 
answer is that cp will do some other things as 
well, which you can investigate for yourself by 
reading the manual. For now we'll stick to sim- 
ple usages. 


Adding Something to the End of a File 


Sometimes you want to add one file to the 
end of another. We have enough building blocks 
now that you can do it; in fact before reading 
further it would be valuable if you figured out 
how. To be specific, how would you use cp, mv 
and/or cat to add the file ‘good1’ to the end of
the file ‘good’? 


You could try 


cat good good1 >temp
mv temp good 


which is probably most direct. You should also 
understand why 


cat good good1 >good


doesn’t work. (Don't practice with a good
‘good’!)

The easy way is to use a variant of >, 
called >>. In fact, >> is identical to > except 
that instead of clobbering the old file, it simply 
tacks stuff on at the end. Thus you could say


cat good1 >>good


and ‘good1’ is added to the end of ‘good’. (And
if ‘good’ didn’t exist, this makes a copy of
‘good1’ called ‘good’.)
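Both ways of adding to the end of a file can be compared in a scratch directory. The sketch below uses one-line files with invented contents and applies the two methods one after the other, so ‘good1’ ends up appended twice.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf 'first\n' > good
printf 'second\n' > good1

cat good good1 > temp && mv temp good   # the two-step way, via a temporary file
cat good1 >> good                       # the direct way, appending with >>
cat good
```

The final ‘good’ holds its original line followed by two copies of the line from ‘good1’, one from each method.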


6. CUT AND PASTE WITH THE EDITOR 


Now we move on to manipulating pieces 
of files — individual lines or groups of lines. 
This is another area where new users seem 
unsure of themselves. 


Filenames 


The first step is to ensure that you know 
the ed commands for reading and writing files. 
Of course you can't go very far without knowing 
r and w. Equally useful, but less well known, is
the ‘edit’ command e. Within ed, the command 


e newfile 


says ‘I want to edit a new file called newfile, 
without leaving the editor.’ The e command
discards whatever you're currently working on and
starts over on newfile. It’s exactly the same as if 
you had quit with the q command, then re- 
entered ed with a new file name, except that if 
you have a pattern remembered, then a com- 
mand like // will still work. 


If you enter ed with the command 
ed file 


ed remembers the name of the file, and any
subsequent e, r or w commands that don’t contain a
filename will refer to this remembered file. Thus 


ed file1
... (editing) ...
w                  (writes back in file1)
e file2            (edit new file, without leaving editor)
... (editing on file2) ...
w                  (writes back on file2)


(and so on) does a series of edits on various files
without ever leaving ed and without typing the 
name of any file more than once. (As an aside, 
if you examine the sequence of commands here, 
you can see why many UNIX systems use e as a 
synonym for ed.) 


You can find out the remembered file 
name at any time with the f command; just type
f without a file name. You can also change the
remembered file name with f; a useful
sequence is


ed precious 
f junk 
... (editing) ... 
which gets a copy of a precious file, then uses f
to guarantee that a careless w command won't
clobber the original.


Inserting One File into Another


Suppose you have a file called ‘memo’, 
and you want the file called ‘table’ to be inserted
just after the reference to Table 1. That is, in
‘memo’ somewhere is a line that says


Table 1 shows that ...


and the data contained in ‘table’ has to go there, 
probably so it will be formatted properly by nroff 
or troff. Now what? 


This one is easy. Edit ‘memo’, find ‘Table
1’, and add the file ‘table’ right there:


ed memo
/Table 1/
Table 1 shows that ...   [response from ed]
.r table


The critical line is the last one. As we said
earlier, the r command reads a file; here you asked
for it to be read in right after line dot. An r
command without any address adds lines at the
end, so it is the same as $r.


Writing out Part of a File 


The other side of the coin is writing out 
part of the document you're editing. For exam- 
ple, maybe you want to split out into a separate 
file that table from the previous example, so it 
can be formatted and tested separately. Suppose 
that in the file being edited we have 


.TS
...[lots of stuff]
.TE


which is the way a table is set up for the tbl pro- 
gram. To isolate the table in a separate file 
called ‘table’, first find the start of the table (the 
‘.TS’ line), then write out the interesting part:


/^\.TS/
.TS                [ed prints the line it found]
.,/^\.TE/w table


and the job is done. If you are confident, you
can do it all at once with


/^\.TS/;/^\.TE/w table
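The one-line version can be run as a script to confirm that only the table is written. This is a sketch: the five-line ‘memo’ is invented, and ed is started with -s so that the w command prints no byte count.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' 'Some text.' '.TS' 'a b c' '.TE' 'More text.' > memo

ed -s memo <<'EOF'
/^\.TS/;/^\.TE/w table
q
EOF
cat table
```

The new file ‘table’ holds the three lines from ‘.TS’ through ‘.TE’, and ‘memo’ is unchanged because the buffer itself was never written back.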


The point is that the w command can write 
out a group of lines, instead of the whole file. In
fact, you can write out a single line if you like; 
just give one line number instead of two. For 
example, if you have just typed a horribly com- 
plicated line and you know that it (or something 
like it) is going to be needed later, then save it;
don't re-type it. In the editor, say


a
...lots of stuff...
...horrible line...
.
.w temp
a
...more stuff...
.
.r temp
a
...more stuff...
.


This last example is worth studying, to be sure 
you appreciate what's going on. 


Moving Lines Around 


Suppose you want to move a paragraph 
from its present position in a paper to the end. 
How would you do it? As a concrete example, 
suppose each paragraph in the paper begins with 
the formatting command ‘.PP’. Think about it 
and write down the details before reading on. 


The brute force way (not necessarily bad) 
is to write the paragraph onto a temporary file, 
delete it from its current position, then read in 
the temporary file at the end. Assuming that 
you are sitting on the ‘.PP’ command that begins 
the paragraph, this is the sequence of commands: 


.,/^\.PP/-w temp
.,//-d
$r temp


That is, from where you are now (‘.’) until one
line before the next ‘.PP’ (‘/^\.PP/-’) write
onto ‘temp’. Then delete the same lines.
Finally, read ‘temp’ at the end. 


As we said, that’s the brute force way.
The easier way (often) is to use the move com- 
mand m that ed provides — it lets you do the 
whole set of operations at one crack, without any 
temporary file. 


The m command is like many other ed 
commands in that it takes up to two line 
numbers in front that tell what lines are to be 
affected. It is also followed by a line number that 
tells where the lines are to go. Thus 


line1,line2 m line3


says to move all the lines between ‘line1’ and
‘line2’ after ‘line3’. Naturally, any of ‘line1’
etc., can be patterns between slashes, $ signs, or 
other ways to specify lines. 

Suppose again that you're sitting at the 
first line of the paragraph. Then you can say 


.,/^\.PP/-m$


That’s all.


As another example of a frequent opera- 
tion, you can reverse the order of two adjacent 
lines by moving the first one to after the second. 
Suppose that you are positioned at the first. 
Then 


m+ 


does it. It says to move line dot to after one line 
after line dot. If you are positioned on the 
second line, 


m--


does the interchange.
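Here is the paragraph move as a scripted session, so you can check the result before trusting it on a real paper. The sketch assumes an invented six-line file, ‘paper’, with three one-line paragraphs.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' '.PP' 'para one' '.PP' 'para two' '.PP' 'para three' > paper

ed -s paper <<'EOF' > out.txt
1
.,/^\.PP/-m$
,p
Q
EOF
cat out.txt
```

The ‘1’ positions dot on the first ‘.PP’ (and prints it); the m command then moves that paragraph to the end, and ‘,p’ prints the rearranged buffer.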


As you can see, the m command is more 
succinct and direct than writing, deleting and re- 
reading. When is brute force better anyway? 
This is a matter of personal taste — do what you 
have most confidence in. The main difficulty 
with the m command is that if you use patterns 
to specify both the lines you are moving and the 
target, you have to take care that you specify 
them properly, or you may well not move the 
lines you thought you did. The result of a 
botched m command can be a ghastly mess. 
Doing the job a step at a time makes it easier for 
you to verify at each step that you accomplished 
what you wanted to. It’s also a good idea to 
issue a w command before doing anything com- 
plicated; then if you goof, it’s easy to back up to 
where you were. 


Marks 


ed provides a facility for marking a line 
with a particular name so you can later reference 
it by name regardless of its actual line number. 
This can be handy for moving lines, and for 
keeping track of them as they move. The mark 
command is k; the command


kx 


marks the current line with the name ‘x’. If a 
line number precedes the k, that line is marked. 
(The mark name must be a single lower case 
letter.) Now you can refer to the marked line 
with the address 


'x


Marks are most useful for moving things
around. Find the first line of the block to be
moved, and mark it with 'a. Then find the last
line and mark it with 'b. Now position yourself
at the place where the stuff is to go and say


'a,'bm.


Bear in mind that only one line can have a
particular mark name associated with it at any
given time. 
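A move by marks can be sketched as follows. The five-line file ‘letters’ is invented, and the marks are set by line number (2ka, 3kb) only to keep the script short; in practice you would usually find the lines by searching.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' a b c d e > letters

ed -s letters <<'EOF' > out.txt
2ka
3kb
5
'a,'bm.
,p
Q
EOF
cat out.txt
```

The ‘5’ prints line 5 and makes it the current line; the marked block (‘b’ and ‘c’) is then moved after it, and ‘,p’ shows the new order.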


Copying Lines 


We mentioned earlier the idea of saving a 
line that was hard to type or used often, so as to 
cut down on typing time. Of course this could 
be more than one line; then the saving is 
presumably even greater. 


ed provides another command, called t
(for ‘transfer’) for making a copy of a group of 
one or more lines at any point. This is often 
easier than writing and reading. 


The t command is identical to the m com- 
mand, except that instead of moving lines it sim- 
ply duplicates them at the place you named. 
Thus 


1,$t$


duplicates the entire contents that you are edit- 
ing. A more common use for t is for creating a 
series of lines that differ only slightly. For 
example, you can say 


a
.......... x ......... (long line)
.
t.               (make a copy)
s/x/y/           (change it a bit)
t.               (make third copy)
s/y/z/           (change it a bit)


and so on.
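The copy-and-modify sequence can be replayed on a one-line file. This is a sketch with an invented line of text; it relies on t leaving dot at the new copy, so each substitute acts on the line just made.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf 'value x here\n' > series

ed -s series <<'EOF' > out.txt
t.
s/x/y/
t.
s/y/z/
,p
Q
EOF
cat out.txt
```

The buffer ends up with three lines that differ only in the one letter.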




The Temporary Escape 


Sometimes it is convenient to be able to 
temporarily escape from the editor to do some 
other UNIX command, perhaps one of the file 
copy or move commands discussed in section 5, 
without leaving the editor. The ‘escape’ com- 
mand ! provides a way to do this. 


If you say 
!any UNIX command


your current editing state is suspended, and the 
UNIX command you asked for is executed. When 
the command finishes, ed will signal you by 
printing another !; at that point you can resume
editing. 


You can really do any UNIX command,
including another ed. (This is quite common, in 
fact.) In this case, you can even do another !. 


7. SUPPORTING TOOLS 


There are several tools and techniques that 
go along with the editor, all of which are
relatively easy once you know how ed works,
because they are all based on the editor. In this
section we will give some fairly cursory examples
of these tools, more to indicate their existence
than to provide a complete tutorial. More
information on each can be found in [3].


Grep 

Sometimes you want to find all
occurrences of some word or pattern in a set of 
files, to edit them or perhaps just to verify their 
presence or absence. It may be possible to edit 
each file separately and look for the pattern of 
interest, but if there are many files this can get 
very tedious, and if the files are really big, it may 
be impossible because of limits in ed. 


The program grep was invented to get 
around these limitations. The search patterns 
that we have described in the paper are often 
called ‘regular expressions’, and ‘grep’ stands for


g/re/p 


That describes exactly what grep does — it prints 
every line in a set of files that contains a particu- 
lar pattern. Thus 


grep 'thing' file1 file2 file3


finds ‘thing’ wherever it occurs in any of the files
‘file1’, ‘file2’, etc. grep also indicates the file in
which the line was found, so you can later edit it 
if you like. 


The pattern represented by ‘thing’ can be 
any pattern you can use in the editor, since grep 
and ed use exactly the same mechanism for pat- 
tern searching. It is wisest always to enclose the 
pattern in the single quotes '...' if it contains any
non-alphabetic characters, since many such char- 
acters also mean something special to the UNIX 
command interpreter (the ‘shell’). If you don’t 
quote them, the command interpreter will try to 
interpret them before grep gets a chance. 


There is also a way to find lines that don't 
contain a pattern: 


grep -v 'thing' file1 file2


finds all lines that don’t contain ‘thing’. The
-v must occur in the position shown. Given
grep and grep -v, it is possible to do things like
selecting all lines that contain some combination 
of patterns. For example, to get all lines that 
contain ‘x’ but not ‘y’: 


grep x file... | grep -v y


(The notation | is a ‘pipe’, which causes the
output of the first command to be used as input to
the second command; see [2].)
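The grep commands above can be tried on two small files. This is a sketch with invented contents; the results are collected in the files found.txt and x_not_y.txt (names chosen for the example) so that they can be inspected afterwards.

```shell
# Work in a scratch directory so no real file is touched.
cd "$(mktemp -d)" || exit 1

printf '%s\n' 'thing in file1' 'x alone' > file1
printf '%s\n' 'x with y' 'no match' > file2

grep 'thing' file1 file2 > found.txt           # lines that contain 'thing'
grep x file1 file2 | grep -v y > x_not_y.txt   # lines with 'x' but not 'y'
cat found.txt x_not_y.txt
```

With more than one file, grep puts the file name in front of each line. Note that the second grep in the pipeline sees that prefix as part of the line, so a ‘y’ in a file name would also cause the line to be dropped.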


Editing Scripts 


If a fairly complicated set of editing operations
is to be done on a whole set of files, the
easiest thing to do is to make up a ‘script’, i.e., a
file that contains the operations you want to perform,
then apply this script to each file in turn.
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For example, suppose you want to change 
every ‘Unix’ to ‘UNIX’ and every ‘Gcos’ to
‘GCOS’ in a large number of files. Then put 
into the file ‘script’ the lines 


g/Unix/s//UNIX/g
g/Gcos/s//GCOS/g
w
q


Now you can say


ed file1 <script
ed file2 <script 


This causes ed to take its commands from the 
prepared script. Notice that the whole job has to 
be planned in advance. 


And of course by using the UNIX command 
interpreter, you can cycle through a set of files 
automatically, with varying degrees of ease. 
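
One way to cycle through the files is a shell for loop. The following is only a sketch: the sample files are invented, and the -s flag (which suppresses ed's byte counts) is an addition that is not part of the commands shown above.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1

# Two sample files, each containing both words to be changed.
printf 'Unix and Gcos\n' > file1
printf 'Gcos, then Unix again: Unix\n' > file2

# The editing script from the text.
cat > script <<'EOF'
g/Unix/s//UNIX/g
g/Gcos/s//GCOS/g
w
q
EOF

# Apply the same script to every file in turn.
for f in file1 file2
do
    ed -s "$f" < script
done

cat file1 file2
```

Because the script ends with w and q, each file is rewritten in place before ed moves on to the next one.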


Sed 


sed (‘stream editor’) is a version of the
editor with restricted capabilities, but one that
can process unlimited amounts of
input. Basically sed copies its input to its output,
applying one or more editing commands to each
line of input.


As an example, suppose that we want to 
do the ‘Unix’ to ‘UNIX’ part of the example
given above, but without rewriting the files. 
Then the command 


sed 's/Unix/UNIX/g' file1 file2 ...


applies the command ‘s/Unix/UNIX/g’ to all
lines from ‘file1’, ‘file2’, etc., and copies all lines
to the output. The advantage of using sed in 
such a case is that it can be used with input too 
large for ed to handle. All the output can be col- 
lected in one place, either in a file or perhaps 
piped into another program. 
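
A short sketch of both ways of collecting the output follows; the file names and contents are invented for the example.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'Unix is here\n' > file1
printf 'more Unix, more Unix\n' > file2

# The edited lines from both files are collected in one new file;
# file1 and file2 themselves are not rewritten.
sed 's/Unix/UNIX/g' file1 file2 > combined
cat combined

# Or pipe the edited stream into another program, here a line counter.
count=$(sed 's/Unix/UNIX/g' file1 file2 | wc -l)
echo "$count"
```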


If the editing transformation is so compli- 
cated that more than one editing command is 
needed, commands can be supplied from a file, 
or on the command line, with a slightly more 
complex syntax. To take commands from a file, 
for example, 


sed -f cmdfile input-files...
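
Both forms can be sketched as follows. The command file and input here are invented for the example; on the command line, each editing command is introduced by -e.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'Unix and Gcos\n' > input

# One editing command per line in the command file.
cat > cmdfile <<'EOF'
s/Unix/UNIX/g
s/Gcos/GCOS/g
EOF

from_file=$(sed -f cmdfile input)

# The same commands given directly on the command line.
from_args=$(sed -e 's/Unix/UNIX/g' -e 's/Gcos/GCOS/g' input)

echo "$from_file"
echo "$from_args"
```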

sed has further capabilities, including con- 
ditional testing and branching, which we cannot 
go into here. 
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1. Getting started 


This document provides a quick introduction to vi. (Pronounced vee-eye.) You should be
running vi on a file you are familiar with while you are reading this. The first part of this docu- 
ment (sections 1 through 5) describes the basics of using vi. Some topics of special interest are 
presented in section 6, and some nitty-gritty details of how the editor functions are saved for 
section 7 to avoid cluttering the presentation here. 


There is also a short appendix here, which gives for each character the special meanings 
which this character has in vi. Attached to this document should be a quick reference card.
This card summarizes the commands of vi in a very compact format. You should have the card
handy while you are learning vi. 


1.1. Specifying terminal type 


Before you can start vi you must tell the system what kind of terminal you are using.
Here is a (necessarily incomplete) list of terminal type codes. If your terminal does not appear 
here, you should consult with one of the staff members on your system to find out the code for 
your terminal. If your terminal does not have a code, one can be assigned and a description for 
the terminal can be created. 


Code     Full name                  Type
2621     Hewlett-Packard 2621 A/P   Intelligent
2645     Hewlett-Packard 264x       Intelligent
act4     Microterm ACT-IV           Dumb
act5     Microterm ACT-V            Dumb
adm3a    Lear Siegler ADM-3a        Dumb
adm31    Lear Siegler ADM-31        Intelligent
c100     Human Design Concept 100   Intelligent
dm1520   Datamedia 1520             Dumb
dm2500   Datamedia 2500             Intelligent
dm3025   Datamedia 3025             Intelligent
fox      Perkin-Elmer Fox           Dumb
h1500    Hazeltine 1500             Intelligent
h19      Heathkit h19               Intelligent
i100     Infoton 100                Intelligent
mime     Imitating a smart act4     Intelligent
t1061    Teleray 1061               Intelligent
vt52     Dec VT-52                  Dumb


The financial support of an IBM Graduate Fellowship and the National Science Foundation under grants
MCS74-07644-A03 and MCS78-07291 is gratefully acknowledged.


Suppose for example that you have a Hewlett-Packard HP2621A terminal. The code used 
by the system for this terminal is ‘2621’. In this case you can use one of the following com- 
mands to tell the system the type of your terminal: 


% setenv TERM 2621


This command works with the shell csh on both version 6 and 7 systems. If you are using the
standard version 7 shell then you should give the commands 


$ TERM=2621 
$ export TERM 
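
The export step matters because the editor is a separate program started by the shell. As a rough check, assuming a Bourne-style shell, you can ask a child shell what value it sees:

```shell
# Set the terminal type and place it in the environment.
TERM=2621
export TERM

# A program started from this shell now sees the value.
seen=$(sh -c 'echo "$TERM"')
echo "$seen"
```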


If you want to arrange to have your terminal type set up automatically when you log in,
you can use the tset program. If you dial in on a mime, but often use hardwired ports, a typical
line for your .login file (if you use csh) would be


setenv TERM `tset - -d mime`
or for your .profile file (if you use sh)
TERM=`tset - -d mime`


Tset knows which terminals are hardwired to each port and needs only to be told that when you 
dial in you are probably on a mime. Tset is usually used to change the erase and kill characters,
too. 


1.2. Editing a file 


After telling the system which kind of terminal you have, you should make a copy of a 
file you are familiar with, and run vi on this file, giving the command 


% vi name 


replacing name with the name of the copy file you just created. The screen should clear and the 
text of your file should appear on the screen. If something else happens refer to the footnote. 


1.3. The editor’s copy: the buffer 


The editor does not directly modify the file which you are editing. Rather, the editor 
makes a copy of this file, in a place called the buffer, and remembers the file’s name. You do 
not affect the contents of the file unless and until you write the changes you make back into the 
original file. 


+ If you gave the system an incorrect terminal type code then the editor may have just made a mess out of 
your screen. This happens when it sends control codes for one kind of terminal to some other kind of termi- 
nal. In this case hit the keys :q (colon and the q key) and then hit the RETURN key. This should get you back 
to the command level interpreter. Figure out what you did wrong (ask someone else if necessary) and try 
again. 

Another thing which can go wrong is that you typed the wrong file name and the editor just printed an 
error diagnostic. In this case you should follow the above procedure for getting out of the editor, and try 
again this time spelling the file name correctly. 

If the editor doesn’t seem to respond to the commands which you type here, try sending an interrupt to it
by hitting the DEL or RUB key on your terminal, and then hitting the :q command again followed by a carriage
return. 
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1.4. Notational conventions 


In our examples, input which must be typed as is will be presented in bold face. Text 
which should be replaced with appropriate input will be given in italics. We will represent spe- 
cial characters in SMALL CAPITALS. 


1.5. Arrow keys 


The editor command set is independent of the terminal you are using. On most terminals 
with cursor positioning keys, these keys will also work within the editor. If you don’t have cursor
positioning keys, or even if you do, you can use the h, j, k, and l keys as cursor positioning
keys (these are labelled with arrows on an adm3a).* 


(Particular note for the HP2621: on this terminal the function keys must be shifted (ick) 
to send to the machine, otherwise they only act locally. Unshifted use will leave the cursor 
positioned incorrectly.) 


1.6. Special characters: ESC, CR and DEL 


Several of these special characters are very important, so be sure to find them right now. 
Look on your keyboard for a key labelled ESC or ALT. It should be near the upper left corner of 
your terminal. Try hitting this key a few times. The editor will ring the bell to indicate that it 
is in a quiescent state.+ Partially formed commands are cancelled by ESC, and when you insert
text in the file you end the text insertion with ESC. This key is a fairly harmless one to hit, so
if you don’t know what is going on you can just hit it until the editor rings the bell.


The CR or RETURN key is important because it is used to terminate certain commands. It 
is usually at the right side of the keyboard, and is the same key used at the end of each
shell command. 


Another very useful key is the DEL or RUB key, which generates an interrupt, telling the 
editor to stop what it is doing. It is a forceful way of making the editor listen to you, or to 
return it to the quiescent state if you don’t know or don’t like what is going on. Try hitting the 
‘/’ key on your terminal. This key is used when you want to specify a string to be searched for. 
The cursor should now be positioned at the bottom line of the terminal after a ‘/’ printed as a 
prompt. You can get the cursor back to the current position by hitting the DEL or RUB key; try 
this now.* From now on we will simply refer to hitting the DEL or RUB key as ‘‘sending an 
interrupt.’’** 


The editor often echoes your commands on the last line of the terminal. If the cursor is 
on the first position of this last line, then the editor is performing a computation, such as com- 
puting a new position in the file after a search or running a command to reformat part of the 
buffer. When this is happening you can stop the editor by sending an interrupt. 


1.7. Getting out of the editor 


After you have worked with this introduction for a while, and you wish to do something 
else, you can give the command ZZ to the editor. This will write the contents of the editor’s 
buffer back into the file you are editing, if you made any changes, and then quit from the editor.
You can also end an editor session by giving the command :q!CR;† this is a dangerous but
occasionally essential command which ends the editor session and discards all your changes.
You need to know about this command in case you change the editor’s copy of a file you wish 


* As we will see later, h moves back to the left (like control-h which is a backspace), j moves down (in the
same column), k moves up (in the same column), and l moves to the right.

+ On smart terminals where it is possible, the editor will quietly flash the screen rather than ringing the bell.

* Backspacing over the ‘/’ will also cancel the search.

** On some systems, this interruptibility comes at a price: you cannot type ahead when the editor is computing
with the cursor on the bottom line.

† All commands which read from the last display line can also be terminated with an ESC as well as a CR.
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only to look at. Be very careful not to give this command when you really want to save the 
changes you have made. 


2. Moving around in the file 


2.1. Scrolling and paging 


The editor has a number of commands for moving around in the file. The most useful of
these is generated by hitting the control and D keys at the same time, a control-D or ‘^D’. We
will use this two character notation for referring to these control keys from now on. You may
have a key labelled ‘^’ on your terminal. This key will be represented as ‘↑’ in this document;
‘^’ is exclusively used as part of the ‘^x’ notation for control characters.


As you know now if you tried hitting ^D, this command scrolls down in the file. The D
thus stands for down. Many editor commands are mnemonic and this makes them much easier
to remember. For instance the command to scroll up is ^U. Many dumb terminals can’t scroll
up at all, in which case hitting ^U clears the screen and refreshes it with a line which is farther
back in the file at the top.


If you want to see more of the file below where you are, you can hit ^E to expose one
more line at the bottom of the screen, leaving the cursor where it is.†† The command ^Y
(which is hopelessly non-mnemonic, but next to ^U on the keyboard) exposes one more line at
the top of the screen.


There are other ways to move around in the file; the keys ^F and ^B + move forward and
backward a page, keeping a couple of lines of continuity between screens so that it is possible to
read through a file using these rather than ^D and ^U if you wish.


Notice the difference between scrolling and paging. If you are trying to read the text in a 
file, hitting ^F to move forward a page will leave you only a little context to look back at.
Scrolling on the other hand leaves more context, and happens more smoothly. You can con- 
tinue to read the text as scrolling is taking place. 


2.2. Searching, goto, and previous context 


Another way to position yourself in the file is by giving the editor a string to search for. 
Type the character / followed by a string of characters terminated by CR. The editor will posi- 
tion the cursor at the next occurrence of this string. Try hitting n to then go to the next 
occurrence of this string. The character ? will search backwards from where you are, and is 
otherwise like /.†


If the search string you give the editor is not present in the file the editor will print a diag- 
nostic on the last line of the screen, and the cursor will be returned to its initial position. 


If you wish the search to match only at the beginning of a line, begin the search string
with an ↑. To match only at the end of a line, end the search string with a $. Thus /↑searchCR
will search for the word ‘search’ at the beginning of a line, and /last$CR searches for the word
‘last’ at the end of a line.*
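
Because the search string is a regular expression of the kind ed(1) uses, the same two anchors can be tried outside the editor with grep. This is only a sketch with an invented sample file; at the keyboard the beginning-of-line anchor is the ^ character.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'search starts this line\nthis line does not search\nthe very last\nlast is first here\n' > sample

# -n prints the line number of each match.
at_start=$(grep -n '^search' sample)   # 'search' only at the beginning of a line
at_end=$(grep -n 'last$' sample)       # 'last' only at the end of a line
echo "$at_start"
echo "$at_end"
```

Lines that contain the word somewhere else are not reported, which is exactly the effect the anchors have on a search inside the editor.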


+ If you don’t have a ‘^’ key on your terminal then there is probably a key labelled ‘↑’; in any case these
characters are one and the same.

†† Version 3 only.

+ Not available in all v2 editors due to memory constraints.

† These searches will normally wrap around the end of the file, and thus find the string even if it is not on a
line in the direction you search provided it is anywhere else in the file. You can disable this wraparound in
scans by giving the command :se nowrapscanCR, or more briefly :se nowsCR.

* Actually, the string you give to search for here can be a regular expression in the sense of the editors ex(1)
and ed(1). If you don’t wish to learn about this yet, you can disable this more general facility by doing
:se nomagicCR; by putting this command in EXINIT in your environment, you can have this always be in
effect (more about EXINIT later.)
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The command G, when preceded by a number will position the cursor at that line in the 
file. Thus 1G will move the cursor to the first line of the file. If you give G no count, then it 
moves to the end of the file. 


If you are near the end of the file, and the last line is not at the bottom of the screen, the 
editor will place only the character ‘~’ on each remaining line. This indicates that the last line 
in the file is on the screen; that is, the ‘~’ lines are past the end of the file. 


You can find out the state of the file you are editing by typing a ^G. The editor will show
you the name of the file you are editing, the number of the current line, the number of lines in 
the buffer, and the percentage of the way through the buffer which you are. Try doing this 
now, and remember the number of the line you are on. Give a G command to get to the end 
and then another G command to get back where you were. 


You can also get back to a previous position by using the command `` (two back quotes).
This is often more convenient than G because it requires no advance preparation. Try giving a
G or a search with / or ? and then a `` to get back to where you were. If you accidentally hit n
or any command which moves you far away from a context of interest, you can quickly get
back by hitting ``.


2.3. Moving around on the screen 


Now try just moving the cursor around on the screen. If your terminal has arrow keys (4 
or 5 keys with arrows going in each direction) try them and convince yourself that they work. 
(On certain terminals using v2 editors, they won’t.) If you don’t have working arrow keys, you 
can always use h, j, k, and l. Experienced users of vi prefer these keys to arrow keys, because
they are usually right underneath their fingers. 


Hit the + key. Each time you do, notice that the cursor advances to the next line in the 
file, at the first non-white position on the line. The - key is like + but goes the other way.


These are very common keys for moving up and down lines in the file. Notice that if you 
go off the bottom or top with these keys then the screen will scroll down (and up if possible) to 
bring a line at a time into view. The RETURN key has the same effect as the + key. 


Vi also has commands to take you to the top, middle and bottom of the screen. H will 
take you to the top (home) line on the screen. Try preceding it with a number as in 3H. This 
will take you to the third line on the screen. Many vi commands take preceding numbers and
do interesting things with them. Try M, which takes you to the middle line on the screen, and 
L, which takes you to the last line on the screen. L also takes counts, thus 5L will take you to 
the fifth line from the bottom. 


2.4. Moving within a line 


Now try picking a word on some line on the screen, not the first word on the line. Move
the cursor using RETURN and - to be on the line where the word is. Try hitting the w key.
This will advance the cursor to the next word on the line. Try hitting the b key to back up
words in the line. Also try the e key which advances you to the end of the current word rather
than to the beginning of the next word. Also try SPACE (the space bar) which moves right one
character and the BS (backspace or ^H) key which moves left one character. The key h works
as ^H does and is useful if you don’t have a BS key. (Also, as noted just above, l will move to
the right.)


If the line had punctuation in it you may have noticed that the w and b keys stopped
at each group of punctuation. You can also go back and forwards words without stopping at 
punctuation by using W and B rather than the lower case equivalents. Think of these as bigger 
words. Try these on a few lines with punctuation to see how they differ from the lower case w 
and b. 


The word keys wrap around the end of the line, rather than stopping at the end. Try moving
to a word on a line below where you are by repeatedly hitting w. 
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SPACE advance the cursor one position 
^B    backwards to previous page
^D    scrolls down in the file
^E    exposes another line at the bottom (v3)
^F    forward to next page
^G    tell what is going on
^H    backspace the cursor
^N    next line, same column
^P    previous line, same column
^U    scrolls up in the file
^Y    exposes another line at the top (v3)
+     next line, at the beginning
-     previous line, at the beginning
/     scan for a following string forwards
?     scan backwards
B     back a word, ignoring punctuation
G     go to specified line, last default
H     home screen line
M     middle screen line
L     last screen line
W     forward a word, ignoring punctuation
b     back a word
e     end of current word
n     scan for next instance of / or ? pattern
w     word after this word


2.6. View +


If you want to use the editor to look at a file, rather than to make changes, invoke it as 
view instead of vi. This will set the readonly option which will prevent you from accidentally
overwriting the file. 


3. Making simple changes 


3.1. Inserting 


One of the most useful commands is the i (insert) command. After you type i, everything
you type until you hit ESC is inserted into the file. Try this now; position yourself to
some word in the file and try inserting text before this word. If you are on a dumb terminal it
will seem, for a minute, that some of the characters in your line have been overwritten, but
they will reappear when you hit ESC.


Now try finding a word which can, but does not, end in an ‘s’. Position yourself at this 
word and type e (move to end of word), then a for append and then ‘sESC’ to terminate the 
textual insert. This sequence of commands can be used to easily pluralize a word. 


Try inserting and appending a few times to make sure you understand how this works; i 
placing text to the left of the cursor, a to the right. 


It is often the case that you want to add new lines to the file you are editing, before or 
after some specific line in the file. Find a line where this makes sense and then give the command
o to create a new line after the line you are on, or the command O to create a new line
before the line you are on. After you create a new line in this way, text you type up to an ESC 


+ Not available in all v2 editors due to memory constraints. 
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is inserted on the new line. 


Many related editor commands are invoked by the same letter key and differ only in that 
one is given by a lower case key and the other is given by an upper case key. In these cases, 
the upper case key often differs from the lower case key in its sense of direction, with the 
upper case key working backward and/or up, while the lower case key moves forward and/or 
down. 


Whenever you are typing in text, you can give many lines of input or just a few characters.
To type in more than one line of text, hit a RETURN at the middle of your input. A new
line will be created for text, and you can continue to type. If you are on a slow and dumb ter- 
minal the editor may choose to wait to redraw the tail of the screen, and will let you type over 
the existing screen lines. This avoids the lengthy delay which would occur if the editor 
attempted to keep the tail of the screen always up to date. The tail of the screen will be fixed 
up, and the missing lines will reappear, when you hit ESC. 


While you are inserting new text, you can use the characters you normally use at the system
command level (usually ^H or #) to backspace over the last character which you typed,
and the character which you use to kill input lines (usually @, ^X, or ^U) to erase the input
you have typed on the current line.† The character ^W will erase a whole word and leave you
after the space after the previous word; it is useful for quickly backing up in an insert.


Notice that when you backspace during an insertion the characters you backspace over are
not erased; the cursor moves backwards, and the characters remain on the display. This is
often useful if you are planning to type in something similar. In any case the characters disappear
when you hit ESC; if you want to get rid of them immediately, hit an ESC and then a
again.

Notice also that you can’t erase characters which you didn’t insert, and that you can’t 
backspace around the end of a line. If you need to back up to the previous line to make a 
correction, just hit ESC and move the cursor back to the previous line. After making the 
correction you can return to where you were and use the insert or append command again. 


3.2. Making small corrections 


You can make small corrections in existing text quite easily. Find a single character 
which is wrong or just pick any character. Use the arrow keys to find the character, or get near 
the character with the word motion keys and then either backspace (hit the BS key or ^H or
even just h) or SPACE (using the space bar) until the cursor is on the character which is wrong.
If the character is not needed then hit the x key; this deletes the character from the file. It is
analogous to the way you x out characters when you make mistakes on a typewriter (except it’s
not as messy). 


If the character is incorrect, you can replace it with the correct character by giving the 
command rc, where c is replaced by the correct character. Finally if the character which is 
incorrect should be replaced by more than one character, give the command s which substitutes 
a string of characters, ending with ESC, for it. If there are a small number of characters which 
are wrong you can precede s with a count of the number of characters to be replaced. Counts 
are also useful with x to specify the number of characters to be deleted. 


3.3. More corrections: operators 


You already know almost enough to make changes at a higher level. All you need to 
know now is that the d key acts as a delete operator. Try the command dw to delete a word. 
Try hitting . a few times. Notice that this repeats the effect of the dw. The command . repeats
the last command which made a change. You can remember it by analogy with an ellipsis ‘...’.


† In fact, the character ^H (backspace) always works to erase the last input character here, regardless of what
your erase character is.
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Now try db. This deletes a word backwards, namely the preceding word. Try dSPACE. 
This deletes a single character, and is equivalent to the x command. 


Another very useful operator is c or change. The command cw thus changes the text of a
single word. You follow it by the replacement text ending with an ESC. Find a word which you
can change to another, and try this now. Notice that the end of the text to be changed was
marked with the character ‘$’ so that you can see this as you are typing in the new material.


3.4. Operating on lines 


It is often the case that you want to operate on lines. Find a line which you want to 
delete, and type dd, the d operator twice. This will delete the line. If you are on a dumb ter- 
minal, the editor may just erase the line on the screen, replacing it with a line with only an @ 
on it. This line does not correspond to any line in your file, but only acts as a place holder. It 
helps to avoid a lengthy redraw of the rest of the screen which would be necessary to close up 
the hole created by the deletion on a terminal without a delete line capability. 


Try repeating the c operator twice; this will change a whole line, erasing its previous contents
and replacing them with text you type up to an ESC.†


You can delete or change more than one line by preceding the dd or cc with a count, i.e.
5dd deletes 5 lines. You can also give a command like dL to delete all the lines up to and 
including the last line on the screen, or d3L to delete through the third from the bottom line. 
Try some commands like this now.* Notice that the editor lets you know when you change a 
large number of lines so that you can see the extent of the change. The editor will also always 
tell you when a change you make affects text which you cannot see. 


3.5. Undoing 


Now suppose that the last change which you made was incorrect; you could use the insert, 
delete and append commands to put the correct material back. However, since it is often the 
case that we regret a change or make a change incorrectly, the editor provides a u (undo) com- 
mand to reverse the last change which you made. Try this a few times, and give it twice in a 
row to notice that a u also undoes a u.


The undo command lets you reverse only a single change. After you make a number of 
changes to a line, you may decide that you would rather have the original state of the line back. 
The U command restores the current line to the state before you started changing it. 


You can recover text which you delete, even if undo will not bring it back; see the section 
on recovering lost text below. 


3.6. Summary 


SPACE advance the cursor one position
^H    backspace the cursor
^W    erase a word during an insert
erase your erase (usually ^H or #), erases a character during an insert
kill  your kill (usually @, ^X, or ^U), kills the insert on this line
.     repeats the changing command
O     opens and inputs new lines, above the current
U     undoes the changes you made to the current line
a     appends text after the cursor
c     changes the object you specify to the following text
d     deletes the object you specify
i     inserts text before the cursor
o     opens and inputs new lines, below the current
u     undoes the last change


† The command S is a convenient synonym for cc, by analogy with s. Think of S as a substitute on
lines, while s is a substitute on characters.

* One subtle point here involves using the / search after a d. This will normally delete characters from the
current position to the point of the match. If what is desired is to delete whole lines including the two points,
give the pattern as /pat/+0, a line address.


4. Moving about; rearranging and duplicating text 


4.1. Low level character motions 


Now move the cursor to a line where there is a punctuation or a bracketing character such 
as a parenthesis or a comma or period. Try the command fx where x is this character. This 
command finds the next x character to the right of the cursor in the current line. Try then hit- 
ting a ;, which finds the next instance of the same character. By using the f command and then 
a sequence of ;’s you can often get to a particular place in a line much faster than with a 
sequence of word motions or SPACEs. There is also an F command, which is like f, but searches
backward. The ; command repeats F also. 


When you are operating on the text in a line it is often desirable to deal with the charac- 
ters up to, but not including, the first instance of a character. Try dfx for some x now and 
notice that the x character is deleted. Undo this with u and then try dtx, the t here stands for 
to, i.e. delete up to the next x, but not the x. The command T is the reverse of t.


When working with the text of a single line, an ↑ moves the cursor to the first non-white
position on the line, and a $ moves it to the end of the line. Thus $a will append new text at 
the end of the current line. 


Your file may have tab (^I) characters in it. These characters are represented as a number
of spaces expanding to a tab stop, where tab stops are every 8 positions.* When the cursor is at 
a tab, it sits on the last of the several spaces which represent that tab. Try moving the cursor 
back and forth over tabs so you understand how this works. 
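
The way a tab stretches to the next tab stop can also be seen outside the editor. The sketch below uses the separate expand utility, which is not part of vi, to turn one tab into the spaces it stands for, first with stops every 8 columns and then every 4.

```shell
# expand replaces each tab with spaces up to the next tab stop.
every8=$(printf 'a\tb\n' | expand)        # default stops: every 8 columns
every4=$(printf 'a\tb\n' | expand -t 4)   # stops every 4 columns
printf '[%s]\n[%s]\n' "$every8" "$every4"
```

In both cases the letter after the tab lands in the first column following a tab stop, so the number of spaces depends on where the tab began.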


On rare occasions, your file may have nonprinting characters in it. These characters are
displayed in the same way they are represented in this document, that is with a two character
code, the first character of which is ‘^’. On the screen non-printing characters resemble a ‘^’
character adjacent to another, but spacing or backspacing over the character will reveal that the
two characters are, like the spaces representing a tab character, a single character.


The editor sometimes discards control characters, depending on the character and the set- 
ting of the beautify option, if you attempt to insert them in your file. You can get a control 
character in the file by beginning an insert and then typing a ^V before the control character.
The ^V quotes the following character, causing it to be inserted directly into the file.


4.2. Higher level text objects 


In working with a document it is often advantageous to work in terms of sentences, para- 
graphs, and sections. The operations ( and ) move to the beginning of the previous and next 
sentences respectively. Thus the command d) will delete the rest of the current sentence; like- 
wise d( will delete the previous sentence if you are at the beginning of the current sentence, or 
the current sentence up to where you are if you are not at the beginning of the current sen- 
tence. 


A sentence is defined to end at a ‘.’, ‘!’ or ‘?’ which is followed by either the end of a
line, or by two spaces. Any number of closing ‘)’, ‘]’, ‘"’ and ‘'’ characters may appear after
the ‘.’, ‘!’ or ‘?’ before the spaces or end of line.


The operations { and } move over paragraphs and the operations [[ and ]] move over sections.†


* This is settable by a command of the form :se ts™xcr, where x is 4 to set tabstops every four columns. 
This has effect on the screen representation within the editor. 
t The I! and |] operations require the operation character to be doubled because they can move the cursor far 


3-62 An Introduction to Display Editing with Vi 


A paragraph begins after each empty line, and also at each of a set of paragraph macros, 
specified by the pairs of characters in the definition of the string valued option paragraphs. The 
default setting for this option defines the paragraph macros of the -ms and -mm macro pack-
ages, i.e. the ‘.IP’, ‘.LP’, ‘.PP’ and ‘.QP’, ‘.P’ and ‘.LI’ macros.‡ Each paragraph boundary is
also a sentence boundary. The sentence and paragraph commands can be given counts to 
operate over groups of sentences and paragraphs. 
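To see how a string of character pairs names a set of macros, consider this Python sketch. It is a simplified model rather than the editor's code: the helper names macro_names and is_paragraph_macro are invented here, the default string is the one listed for the paragraphs option in the options table of section 6.2, and the test compares the whole macro name on the line, which may be stricter than what the editor does.

```python
def macro_names(option_value):
    """Split a paragraphs or sections option value into macro names."""
    pairs = [option_value[i:i + 2] for i in range(0, len(option_value), 2)]
    # A one-letter macro such as .P is stored as the letter plus a space.
    return [pair.strip() for pair in pairs]


def is_paragraph_macro(line, option_value="IPLPPPQPbpP LI"):
    """True if line is a macro call whose name is listed in option_value."""
    if not line.startswith("."):
        return False
    fields = line[1:].split()
    return bool(fields) and fields[0] in macro_names(option_value)
```

The same splitting applies to the sections option; only the string of pairs differs.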


Sections in the editor begin after each macro in the sections option, normally ‘.NH’, ‘.SH’, 
‘.H’ and ‘.HU’, and each line with a formfeed ^L in the first column. Section boundaries are
always line and paragraph boundaries also. 


Try experimenting with the sentence and paragraph commands until you are sure how 
they work. If you have a large document, try looking through it using the section commands. 
The section commands interpret a preceding count as a different window size in which to 
redraw the screen at the new location, and this window size is the base size for newly drawn 
windows until another size is specified. This is very useful if you are on a slow terminal and 
are looking for a particular section. You can give the first section command a small count to 
then see each successive section heading in a small window. 


4.3. Rearranging and duplicating text 


The editor has a single unnamed buffer where the last deleted or changed away text is 
saved, and a set of named buffers a-z which you can use to save copies of text and to move
text around in your file and between files. 


The operator y yanks a copy of the object which follows into the unnamed buffer. If pre- 
ceded by a buffer name, "xy, where x here is replaced by a letter a-z, it places the text in the
named buffer. The text can then be put back in the file with the commands p and P; p puts 
the text after or below the cursor, while P puts the text before or above the cursor. 


If the text which you yank forms a part of a line, or is an object such as a sentence which 
partially spans more than one line, then when you put the text back, it will be placed after the 
cursor (or before if you use P). If the yanked text forms whole lines, they will be put back as 
whole lines, without changing the current line. In this case, the put acts much like an o or O
command. 


Try the command YP. This makes a copy of the current line and leaves you on this copy, 
which is placed before the current line. The command Y is a convenient abbreviation for yy. 
The command Yp will also make a copy of the current line, and place it after the current line.
You can give Y a count of lines to yank, and thus duplicate several lines; try 3YP. 


To move text within the buffer, you need to delete it in one place, and put it back in
another. You can precede a delete operation by the name of a buffer in which the text is to be
stored, as in "a5dd, deleting 5 lines into the named buffer a. You can then move the cursor to
the eventual resting place of these lines and do a "ap or "aP to put them back. In fact, you
can switch and edit another file before you put the lines back, by giving a command of the form
:e nameCR where name is the name of the other file you want to edit. You will have to write
back the contents of the current editor buffer (or discard them) if you have made changes
before the editor will let you switch to the other file. An ordinary delete command saves the
text in the unnamed buffer, so that an ordinary put can move it elsewhere. However, the
unnamed buffer is lost when you change files, so to move text from one file to another you
should use a named buffer.


from where it currently is. While it is easy to get back with the command ``, these commands would still be
frustrating if they were easy to hit accidentally.

‡ You can easily change or extend this set of macros by assigning a different string to the paragraphs option
in your EXINIT. See section 6.2 for details. The ‘.bp’ directive is also considered to start a paragraph.
4.4. Summary.


^       first non-white on line
$       end of line
)       forward sentence
}       forward paragraph
]]      forward section
(       backward sentence
{       backward paragraph
[[      backward section
fx      find x forward in line
p       put text back, after cursor or below current line
y       yank operator, for copies and moves
tx      up to x forward, for operators
Fx      f backward in line
P       put text back, before cursor or above current line
Tx      t backward in line


5. High level commands 


5.1. Writing, quitting, editing new files 


So far we have seen how to enter vi and to write out our file using either ZZ or :wCR.
The first exits from the editor (writing if changes were made); the second writes and stays in
the editor. 


If you have changed the editor’s copy of the file but do not wish to save your changes, 
either because you messed up the file or decided that the changes are not an improvement to 
the file, then you can give the command :q!CR to quit from the editor without writing the
changes. You can also reedit the same file (starting over) by giving the command :e!CR. These
commands should be used only rarely, and with caution, as it is not possible to recover the 
changes you have made after you discard them in this manner. 


You can edit a different file without leaving the editor by giving the command :e nameCR.
If you have not written out your file before you try to do this, then the editor will tell you this,
and delay editing the other file. You can then give the command :wCR to save your work and
then the :e nameCR command again, or carefully give the command :e! nameCR, which edits
the other file discarding the changes you have made to the current file. To have the editor
automatically save changes, include set autowrite in your EXINIT, and use :n instead of :e.


5.2. Escaping to a shell 


You can get to a shell to execute a single command by giving a vi command of the form
:!cmdCR. The system will run the single command cmd and when the command finishes, the
editor will ask you to hit a RETURN to continue. When you have finished looking at the output 
on the screen, you should hit RETURN and the editor will clear the screen and redraw it. You 
can then continue editing. You can also give another : command when it asks you for a 
RETURN; in this case the screen will not be redrawn. 


If you wish to execute more than one command in the shell, then you can give the com- 
mand :shCR. This will give you a new shell, and when you finish with the shell, ending it by
typing a ^D, the editor will clear the screen and continue.


On systems which support it, ^Z will suspend the editor and return to the (top level)
shell. When the editor is resumed, the screen will be redrawn. 
5.3. Marking and returning


The command `` returns to the previous place after a motion of the cursor by a com-
mand such as /, ? or G. You can also mark lines in the file with single letter tags and return to
these marks later by naming the tags. Try marking the current line with the command mx,
where you should pick some letter for x, say ‘a’. Then move the cursor to a different line (any
way you like) and hit `a. The cursor will return to the place which you marked. Marks last
only until you edit another file.


When using operators such as d and referring to marked lines, it is often desirable to 
delete whole lines rather than deleting to the exact position in the line marked by m. In this 
case you can use the form 'x rather than `x. Used without an operator, 'x will move to the first
non-white character of the marked line; similarly '' moves to the first non-white character of
the line containing the previous context mark ``.


5.4. Adjusting the screen 


If the screen image is messed up because of a transmission error to your terminal, or 
because some program other than the editor wrote output to your terminal, you can hit a ^L,
the ASCII form-feed character, to cause the screen to be refreshed. 


On a dumb terminal, if there are @ lines in the middle of the screen as a result of line 
deletion, you may get rid of these lines by typing ^R to cause the editor to retype the screen,
closing up these holes. 


Finally, if you wish to place a certain line on the screen at the top, middle or bottom of
the screen, you can position the cursor to that line, and then give a z command. You should
follow the z command with a RETURN if you want the line to appear at the top of the window, a
. if you want it at the center, or a - if you want it at the bottom. (z., z-, and z+ are not avail-
able on all v2 editors.)


6. Special topics 


6.1. Editing on slow terminals 


When you are on a slow terminal, it is important to limit the amount of output which is
generated to your screen so that you will not suffer long delays, waiting for the screen to be 
refreshed. We have already pointed out how the editor optimizes the updating of the screen 
during insertions on dumb terminals to limit the delays, and how the editor erases lines to @ 
when they are deleted on dumb terminals. 


The use of the slow terminal insertion mode is controlled by the slowopen option. You
can force the editor to use this mode even on faster terminals by giving the command :se
slowCR. If your system is sluggish this helps lessen the amount of output coming to your ter-
minal. You can disable this option by :se noslowCR.


The editor can simulate an intelligent terminal on a dumb one. Try giving the command 
:se redrawCR. This simulation generates a great deal of output and is generally tolerable only 
on lightly loaded systems and fast terminals. You can disable this by giving the command 
:se noredrawCR.


The editor also makes editing more pleasant at low speed by starting editing in a small 
window, and letting the window expand as you edit. This works particularly well on intelligent 
terminals. The editor can expand the window easily when you insert in the middle of the 
screen on these terminals. If possible, try the editor on an intelligent terminal to see how this 
works. 

You can control the size of the window which is redrawn each time the screen is cleared 
by giving window sizes as argument to the commands which cause large screen motions: 


:  /  ?  [[  ]]  `  '


Thus if you are searching for a particular instance of a common string in a file you can precede 
the first search command by a small number, say 3, and the editor will draw three line windows
around each instance of the string which it locates. 


You can easily expand or contract the window, placing the current line as you choose, by 
giving a number on a z command, after the z and before the following RETURN, . or -. Thus
the command z5. redraws the screen with the current line in the center of a five line window.†


If the editor is redrawing or otherwise updating large portions of the display, you can 
interrupt this updating by hitting a DEL or RUB as usual. If you do this you may partially con- 
fuse the editor about what is displayed on the screen. You can still edit the text on the screen 
if you wish; clear up the confusion by hitting a ^L; or move or search again, ignoring the
current state of the display. 


See section 7.8 on open mode for another way to use the vi command set on slow termi- 
nals. 


6.2. Options, set, and editor startup files 


The editor has a set of options, some of which have been mentioned above. The most
useful options are given in the following table.


Name         Default               Description

autoindent   noai                  Supply indentation automatically
autowrite    noaw                  Automatic write before :n, :ta, ^^, !
ignorecase   noic                  Ignore case in searching
lisp         nolisp                ( { ) } commands deal with S-expressions
list         nolist                Tabs print as ^I; end of lines marked with $
magic        nomagic               The characters . [ and * are special in scans
number       nonu                  Lines are displayed prefixed with line numbers
paragraphs   para=IPLPPPQPbpP LI   Macro names which start paragraphs
redraw       nore                  Simulate a smart terminal on a dumb one
sections     sect=NHSHH HU         Macro names which start new sections
shiftwidth   sw=8                  Shift distance for <, > and input ^D and ^T
showmatch    nosm                  Show matching ( or { as ) or } is typed
slowopen     slow                  Postpone display updates during inserts
term         dumb                  The kind of terminal you are using.


The options are of three kinds: numeric options, string options, and toggle options. You 
can set numeric and string options by a statement of the form 


set opt=val


and toggle options can be set or unset by statements of one of the forms


set opt
set noopt


These statements can be placed in your EXINIT in your environment, or given while you are 
running vi by preceding them with a : and following them with a CR.


You can get a list of all options which you have changed by the command :setCR, or the
value of a single option by the command :set opt?CR. A list of all possible options and their
values is generated by :set allCR. Set can be abbreviated se. Multiple options can be placed on
one line, e.g. :se ai aw nuCR.


Options set by the set command only last while you stay in the editor. It is common to 
want to have certain options set whenever you use the editor. This can be accomplished by 
creating a list of ex commands‡ which are to be run every time you start up ex, edit, or vi. A


† Note that the command 5z. has an entirely different effect, placing line 5 in the center of a new window.
‡ All commands which start with : are ex commands.
typical list includes a set command, and possibly a few map commands (on v3 editors). Since
it is advisable to get these commands on one line, they can be separated with the | character, for 
example: 


set ai aw terse|map @ dd|map # x


which sets the options autoindent, autowrite, terse, (the set command), makes @ delete a line, 
(the first map), and makes # delete a character, (the second map). (See section 6.9 for a 
description of the map command, which only works in version 3.) This string should be placed 
in the variable EXINIT in your environment. If you use csh, put this line in the file .login in
your home directory:


setenv EXINIT 'set ai aw terse|map @ dd|map # x'


If you use the standard v7 shell, put these lines in the file .profile in your home directory:


EXINIT='set ai aw terse|map @ dd|map # x'
export EXINIT


On a version 6 system, the concept of environments is not present. In this case, put the line in 
the file .exrc in your home directory. 


set ai aw terse|map @ dd|map # x


Of course, the particulars of the line would depend on which options you wanted to set.


6.3. Recovering lost lines 


You might have a serious problem if you delete a number of lines and then regret that 
they were deleted. Despair not, the editor saves the last 9 deleted blocks of text in a set of 
numbered registers 1-9. You can get the n’th previous deleted text back in your file by the
command "np. The " here says that a buffer name is to follow, n is the number of the buffer
you wish to try (use the number 1 for now), and p is the put command, which puts text in the
buffer after the cursor. If this doesn’t bring back the text you wanted, hit u to undo this and
then . (period) to repeat the put command. In general the . command will repeat the last
change you made. As a special case, when the last command refers to a numbered text buffer,
the . command increments the number of the buffer before repeating the command. Thus a
sequence of the form


"1pu.u.u.


will, if repeated long enough, show you all the deleted text which has been saved for you. You 
can omit the u commands here to gather up all this text in the buffer, or stop after any . com- 
mand to keep just the then recovered text. The command P can also be used rather than p to 
put the recovered text before rather than after the cursor. 
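The way the . command steps through the numbered buffers can be modelled in a few lines of Python. This is a toy model only: the function name puts_shown is invented, the list deleted_blocks stands in for registers 1 through 9 with the most recent deletion first, and stopping when no older text is saved is an assumption of the sketch rather than documented editor behavior.

```python
def puts_shown(deleted_blocks, dots):
    """Text put by "1p and then by each of `dots` repetitions with the . command.

    deleted_blocks[0] is the most recently deleted text (register 1).
    """
    shown = []
    for number in range(1, dots + 2):
        if number > len(deleted_blocks):
            break  # assumption: nothing older has been saved
        shown.append((number, deleted_blocks[number - 1]))
    return shown
```

Each u in the sequence removes the text just shown, so that only the block you finally stop on remains in the file.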


6.4. Recovering lost files 


If the system crashes, you can recover the work you were doing to within a few changes. 
You will normally receive mail when you next login giving you the name of the file which has 
been saved for you. You should then change to the directory where you were when the system 
crashed and give a command of the form: 


% vi -r name


replacing name with the name of the file which you were editing. This will recover your work
to a point near where you left off.†


† In rare cases, some of the lines of the file may be lost. The editor will give you the numbers of these lines
and the text of the lines will be replaced by the string ‘LOST’. These lines will almost always be among the 
last few which you changed. You can either choose to discard the changes which you made (if they are easy 
to remake) or to replace the few lost lines by hand. 
You can get a listing of the files which are saved for you by giving the command:


% vi -r


If there is more than one instance of a particular file saved, the editor gives you the newest 
instance each time you recover it. You can thus get an older saved copy back by first recover- 
ing the newer copies. 


For this feature to work, vi must be correctly installed by a super user on your system,
and the mail program must exist to receive mail. The invocation ‘‘vi -r’’ will not always list all
saved files, but they can be recovered even if they are not listed.


6.5. Continuous text input 


When you are typing in large amounts of text it is convenient to have lines broken near
the right margin automatically. You can cause this to happen by giving the command :se
wm=10CR. This causes all lines to be broken at a space at least 10 columns from the right
hand edge of the screen.*
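The effect of the margin can be approximated after the fact with the Python sketch below. It is a rough model only, not how the editor works: the editor breaks lines as you type, while this function (break_at_margin, a name invented here) splits a finished string, and the screen width of 80 columns is an assumed value.

```python
def break_at_margin(text, width=80, wm=10):
    """Split text at spaces lying at least wm columns from the right edge."""
    limit = width - wm
    lines = []
    while len(text) > limit:
        # Last space whose 0-based index is below limit.
        cut = text.rfind(" ", 0, limit)
        if cut <= 0:
            break  # no usable space; leave the long line alone
        lines.append(text[:cut])
        text = text[cut + 1:]
    lines.append(text)
    return lines
```

A line with no space before the margin is left whole in this model.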


If the editor breaks an input line and you wish to put it back together you can tell it to 
join the lines with J. You can give J a count of the number of lines to be joined as in 3J to 
join 3 lines. The editor supplies white space, if appropriate, at the juncture of the joined lines, 
and leaves the cursor at this white space. You can kill the white space with x if you don’t want 
it. 


6.6. Features for editing programs 


The editor has a number of commands for editing programs. The thing that most distin- 
guishes editing of programs from editing of text is the desirability of maintaining an indented 
structure to the body of the program. The editor has an autoindent facility for helping you gen-
erate correctly indented programs. 


To enable this facility you can give the command :se aiCR. Now try opening a new line 
with o and type some characters on the line after a few tabs. If you now start another line, 
notice that the editor supplies white space at the beginning of the line to line it up with the pre- 
vious line. You cannot backspace over this indentation, but you can use the ^D key to backtab
over the supplied indentation. 


Each time you type ^D you back up one position, normally to an 8 column boundary.
This amount is settable; the editor has an option called shiftwidth which you can set to change 
this value. Try giving the command :se sw=4CR and then experimenting with autoindent 
again. 
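The effect of ^D on the indentation is simple arithmetic, sketched here in Python. The function name backtab is invented for the illustration, indentation is measured in columns, and the sketch assumes that a ^D always moves back to the previous multiple of shiftwidth.

```python
def backtab(indent, shiftwidth=8):
    """Indentation left after one ^D: the previous multiple of shiftwidth."""
    if indent <= 0:
        return 0
    return ((indent - 1) // shiftwidth) * shiftwidth
```

For example, with the default setting an indentation of 12 columns backs up to 8, while with sw=4 an indentation of 6 backs up to 4.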


For shifting lines in the program left and right, there are operators < and >. These shift 
the lines you specify right or left by one shiftwidth. Try << and >> which shift one line left
or right, and <L and >L shifting the rest of the display left and right. 


If you have a complicated expression and wish to see how the parentheses match, put the 
cursor at a left or right parenthesis and hit %. This will show you the matching parenthesis. 
This works also for braces { and }, and brackets [ and ].


If you are editing C programs, you can use the [[ and ]] keys to advance or retreat to a
line starting with a {, i.e. a function declaration at a time. When ]] is used with an operator it
stops after a line which starts with }; this is sometimes useful with y]].


* This feature is not available on some v2 editors. In v2 editors where it is available, the break can only oc- 
cur to the right of the specified boundary instead of to the left. 




6.7. Filtering portions of the buffer 


You can run system commands over portions of the buffer using the operator !. You can 
use this to sort lines in the buffer, or to reformat portions of the buffer with a pretty-printer. 
Try typing in a list of random words, one per line and ending them with a blank line. Back up 
to the beginning of the list, and then give the command !}sortCR. This says to sort the next
paragraph of material, and the blank line ends a paragraph. 
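What the !} filter does to the buffer can be pictured with this Python sketch. It is a stand-in, not the real mechanism: the editor pipes the lines through an external program, while here Python's sorted takes the place of sort (whose ordering can differ with the system's collating rules), and the name filter_paragraph is invented. Only an empty line is treated as the end of a paragraph.

```python
def filter_paragraph(lines, start, command=sorted):
    """Replace lines from start up to the next empty line with command(lines)."""
    end = start
    while end < len(lines) and lines[end] != "":
        end += 1
    return lines[:start] + list(command(lines[start:end])) + lines[end:]


buffer = ["pear", "apple", "fig", "", "zebra", "ant"]
sorted_buffer = filter_paragraph(buffer, 0)
```

Lines after the empty line are left untouched, just as text beyond the end of the paragraph is left alone by !}.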


6.8. Commands for editing LISP†


If you are editing a LISP program you should set the option lisp by doing :se lispCR. This
changes the ( and ) commands to move backward and forward over s-expressions. The { and } 
commands are like ( and ) but don’t stop at atoms. These can be used to skip to the next list, 
or through a comment quickly. 


The autoindent option works differently for LISP, supplying indent to align at the first argu- 
ment to the last open list. If there is no such argument then the indent is two spaces more 
than the last level. 


There is another option which is useful for typing in LISP, the showmatch option. Try set-
ting it with :se smCR and then try typing a ‘(’ some words and then a ‘)’. Notice that the cur-
sor shows the position of the ‘(’ which matches the ‘)’ briefly. This happens only if the match-
ing ‘(’ is on the screen, and the cursor stays there for at most one second.


The editor also has an operator to realign existing lines as though they had been typed in 
with lisp and autoindent set. This is the = operator. Try the command =% at the beginning of 
a function. This will realign all the lines of the function declaration. 


When you are editing LISP, the [[ and ]] advance and retreat to lines beginning with a (,
and are useful for dealing with entire function definitions.


6.9. Macros‡


Vi has a parameterless macro facility, which lets you set it up so that when you hit a single
keystroke, the editor will act as though you had hit some longer sequence of keys. You can set 
this up if you find yourself typing the same sequence of commands repeatedly. 


Briefly, there are two flavors of macros: 


a) Ones where you put the macro body in a buffer register, say x. You can then type @x to
invoke the macro. The @ may be followed by another @ to repeat the last macro.


b) You can use the map command from vi (typically in your EXINIT) with a command of the
form:


:map lhs rhsCR


mapping lhs into rhs. There are restrictions: lhs should be one keystroke (either 1 charac-
ter or one function key) since it must be entered within one second (unless notimeout is
set, in which case you can type it as slowly as you wish, and vi will wait for you to finish it
before it echoes anything). The lhs can be no longer than 10 characters, the rhs no longer
than 100. To get a space, tab or newline into lhs or rhs you should escape them with a ^V.
(It may be necessary to double the ^V if the map command is given inside vi, rather than
in ex.) Spaces and tabs inside the rhs need not be escaped.


Thus to make the q key write and exit the editor, you can give the command


:map q :wq^V^VCR CR


which means that whenever you type q, it will be as though you had typed the four characters
:wqCR. A ^V is needed because without it the CR would end the : command, rather than


† The LISP features are not available on some v2 editors due to memory constraints.
‡ The macro feature is available only in version 3 editors.
becoming part of the map definition. There are two ^V’s because from within vi, two ^V’s must
be typed to get one. The first CR is part of the rhs, the second terminates the : command.


Macros can be deleted with 


unmap lhs


If the lhs of a macro is ‘‘#0’’ through ‘‘#9’’, this maps the particular function key instead
of the 2 character ‘‘#’’ sequence. So that terminals without function keys can access such
definitions, the form ‘‘#x’’ will mean function key x on all terminals (and need not be typed
within one second.) The character ‘‘#’’ can be changed by using a macro in the usual way:


:map ^V^V^I #


to use tab, for example. (This won’t affect the map command, which still uses #, but just the
invocation from visual mode.)


The undo command reverses an entire macro call as a unit, if it made any changes.


Placing a ‘!’ after the word map causes the mapping to apply to input mode, rather than
command mode. Thus, to arrange for ^T to be the same as 4 spaces in input mode, you can
type:


:map! ^T ^V␢␢␢␢


where ␢ is a blank. The ^V is necessary to prevent the blanks from being taken as white space
between the lhs and rhs.


7. Word Abbreviations ‡‡


A feature similar to macros in input mode is word abbreviation. This allows you to type a 
short word and have it expanded into a longer word or words. The commands are :abbreviate 
and :unabbreviate (:ab and :una) and have the same syntax as :map. For example: 


:ab eecs Electrical Engineering and Computer Sciences


causes the word ‘eecs’ to always be changed into the phrase ‘Electrical Engineering and Com- 
puter Sciences’. Word abbreviation is different from macros in that only whole words are 
affected. If ‘eecs’ were typed as part of a larger word, it would be left alone. Also, the partial 
word is echoed as it is typed. There is no need for an abbreviation to be a single keystroke, as 
it should be with a macro. 
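The whole-word rule can be illustrated with a short Python sketch. It only approximates the behavior: the editor expands an abbreviation as you type in input mode, whereas this function (expand_abbreviation, a name invented here) rewrites a finished string, and its notion of a word is that of Python's \b pattern.

```python
import re


def expand_abbreviation(text, short, full):
    """Replace short by full wherever short stands alone as a whole word."""
    pattern = r"\b" + re.escape(short) + r"\b"
    # A function is used as the replacement so that full is inserted literally.
    return re.sub(pattern, lambda match: full, text)


PHRASE = "Electrical Engineering and Computer Sciences"
expanded = expand_abbreviation("the eecs office", "eecs", PHRASE)
```

A string such as ‘geecs’ is returned unchanged, since ‘eecs’ appears there only as part of a larger word.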


7.1. Abbreviations 


The editor has a number of short commands which abbreviate longer commands which we 
have introduced here. You can find these commands easily on the quick reference card. They 
often save a bit of typing and you can learn them as convenient. 


8. Nitty-gritty details 


8.1. Line representation in the display 


The editor folds long logical lines onto many physical lines in the display. Commands 
which advance lines advance logical lines and will skip over all the segments of a line in one 
motion. The command | moves the cursor to a specific column, and may be useful for getting 
near the middle of a long line to split it in half. Try 80| on a line which is more than 80 
columns long.†


The editor only puts full lines on the display; if there is not enough room on the display 
to fit a logical line, the editor leaves the physical line empty, placing only an @ on the line as a 


‡‡ Version 3 only.
† You can make long lines very easily by using J to join together short lines.
place holder. When you delete lines on a dumb terminal, the editor will often just clear the
lines to @ to save time (rather than rewriting the rest of the screen.) You can always maximize
the information on the screen by giving the ^R command.


If you wish, you can have the editor place line numbers before each line on the display. 
Give the command :se nuCR to enable this, and the command :se nonuCR to turn it off. You 
can have tabs represented as ^I and the ends of lines indicated with ‘$’ by giving the command
:se listCR; :se nolistCR turns this off. 


Finally, lines consisting of only the character ‘~’ are displayed when the last line in the file 
is in the middle of the screen. These represent physical lines which are past the logical end of 
file. 


8.2. Counts 


Most vi commands will use a preceding count to affect their behavior in some way. The 
following table gives the common ways in which the counts are used: 


new window size       :  /  ?  [[  ]]  `  '
scroll amount         ^D  ^U
line/column number    z  G  |
repeat effect         most of the rest


The editor maintains a notion of the current default window size. On terminals which run 
at speeds greater than 1200 baud the editor uses the full terminal screen. On terminals which 
are slower than 1200 baud (most dialup lines are in this group) the editor uses 8 lines as the 
default window size. At 1200 baud the default is 16 lines. 
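The rule for the default window size can be restated as a small Python function. This is just a summary of the paragraph above in code form: the name default_window is invented, and screen_lines stands for the number of lines on your terminal.

```python
def default_window(baud, screen_lines):
    """Default window size in lines for a terminal running at the given speed."""
    if baud < 1200:
        return 8
    if baud == 1200:
        return 16
    return screen_lines  # faster terminals use the full screen
```

On a 24 line terminal this gives 8 lines at 300 baud, 16 lines at 1200 baud and all 24 lines at 9600 baud.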


This size is the size used when the editor clears and refills the screen after a search or 
other motion moves far from the edge of the current window. The commands which take a 
new window size as count all often cause the screen to be redrawn. If you anticipate this, but
do not need as large a window as you are currently using, you may wish to change the screen
size by specifying the new size before these commands. In any case, the number of lines used
on the screen will expand if you move off the top with a - or similar command or off the bot-
tom with a command such as RETURN or ^D. The window will revert to the last specified size
the next time it is cleared and refilled.†


The scroll commands ^D and ^U likewise remember the amount of scroll last specified,
using half the basic window size initially. The simple insert commands use a count to specify a
repetition of the inserted text. Thus 10a+----ESC will insert a grid-like string of text. A
few commands also use a preceding count as a line or column number. 


Except for a few commands which ignore any counts (such as ^R), the rest of the editor
commands use a count to indicate a simple repetition of their effect. Thus 5w advances five
words on the current line, while 5RETURN advances five lines. A very useful instance of a
count as a repetition is a count given to the . command, which repeats the last changing com- 
mand. If you do dw and then 3., you will delete first one and then three words. You can then 
delete two more words with 2.. 


8.3. More file manipulation commands 


The following table lists the file manipulation commands which you can use when you are 
in vi. All of these commands are followed by a CR or ESC. The most basic commands are :w 
and :e. A normal editing session on a single file will end with a ZZ command. If you are edit- 
ing for a long period of time you can give :w commands occasionally after major amounts of 
editing, and then finish with a ZZ. When you edit more than one file, you can finish with one 


† But not by a ^L which just redraws the screen as it is.
:w            write back changes
:wq           write and quit
:x            write (if necessary) and quit (same as ZZ).
:e name       edit file name
:e!           reedit, discarding changes
:e + name     edit, starting at end
:e +n         edit, starting at line n
:e #          edit alternate file
:w name       write file name
:w! name      overwrite file name
:x,yw name    write lines x through y to name
:r name       read file name into buffer
:r !cmd       read output of cmd into buffer
:n            edit next file in argument list
:n!           edit next file, discarding changes to current
:n args       specify new argument list
:ta tag       edit file containing tag tag, at tag


with a :w and start editing a new file by giving a :e command, or set autowrite and use :n 
<file>. 


If you make changes to the editor’s copy of a file, but do not wish to write them back, 
then you must give an ! after the command you would otherwise use; this forces the editor to 
discard any changes you have made. Use this carefully. 


The :e command can be given a + argument to start at the end of the file, or a +n argu-
ment to start at line n. In actuality, n may be any editor command not containing a space, use-
fully a scan like +/pat or +?pat. In forming new names to the e command, you can use the
character % which is replaced by the current file name, or the character # which is replaced by 
the alternate file name. The alternate file name is generally the last name you typed other than 
the current file. Thus if you try to do a :e and get a diagnostic that you haven’t written the file, 
you can give a :w command and then a :e # command to redo the previous :e. 


You can write part of the buffer to a file by finding out the lines that bound the range to 
be written using ^G, and giving these numbers after the : and before the w, separated by ,’s. 
You can also mark these lines with m and then use an address of the form ’x,’y on the w com- 
mand here. 


You can read another file into the buffer after the current line by using the :r command. 
You can similarly read in the output from a command; just use !cmd instead of a file name. 


If you wish to edit a set of files in succession, you can give all the names on the command 
line, and then edit each one in turn using the command :n. It is also possible to respecify the 
list of files to be edited by giving the :n command a list of file names, or a pattern to be 
expanded as you would have given it on the initial vi command. 


If you are editing large programs, you will find the :ta command very useful. It utilizes a 
data base of function names and their locations, which can be created by programs such as 
ctags, to quickly find a function whose name you give. If the :ta command will require the edi- 
tor to switch files, then you must :w or abandon any changes before switching. You can repeat 
the :ta command without any arguments to look for the same tag again. (The tag feature is not 
available in some v2 editors.) 
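

To make the tag lookup concrete, here is a minimal Python sketch of how a tags file of the 
kind ctags writes could be consulted. It assumes the traditional layout of one entry per 
line with three tab-separated fields (tag name, file name, and a search pattern or line 
number). The function name find_tag and the sample entries are hypothetical, and the 
editor's own lookup may well differ. 

```python
def find_tag(tags_text, name):
    """Return (file, address) for the first entry named `name`, else None.

    Assumes each line is: tag<TAB>file<TAB>address, where address is an
    ex search pattern such as /^main(argc, argv)$/ or a line number.
    """
    for line in tags_text.splitlines():
        fields = line.split("\t", 2)
        if len(fields) == 3 and fields[0] == name:
            return fields[1], fields[2]
    return None


# Hypothetical tags file contents, for illustration only.
SAMPLE_TAGS = (
    "main\tprog.c\t/^main(argc, argv)$/\n"
    "usage\tutil.c\t/^usage()$/\n"
)

if __name__ == "__main__":
    print(find_tag(SAMPLE_TAGS, "usage"))
```

Under these assumptions, :ta usage would amount to editing the file named in the second 
field and running the address in the third field as a search. 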


8.4. More about searching for strings 


When you are searching for strings in the file with / and ?, the editor normally places you 
at the next or previous occurrence of the string. If you are using an operator such as d, c or y, 
then you may well wish to affect lines up to the line before the line containing the pattern. 
You can give a search of the form /pat/-n to refer to the n’th line before the next line 
containing pat, or you can use + instead of - to refer to the lines after the one containing 
pat. If you don’t give a line offset, then the editor will affect characters up to the match 
place, rather than whole lines; thus use "+0" to affect to the line which matches. 
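

The effect of a line offset can be sketched in a few lines of Python. This is only an 
illustration: the function name search_with_offset is hypothetical, Python's re module 
stands in for the editor's own pattern matcher, line numbers are 1-based, and the scan does 
not wrap around the end of the buffer as the editor normally would. 

```python
import re


def search_with_offset(lines, current, pattern, offset=0):
    """Return the 1-based line number addressed by /pattern/+offset.

    `lines` is the buffer as a list of strings and `current` is the 1-based
    current line. The scan starts on the line after `current` and, for
    simplicity, does not wrap around the end of the buffer.
    """
    for number in range(current + 1, len(lines) + 1):
        if re.search(pattern, lines[number - 1]):
            target = number + offset
            if 1 <= target <= len(lines):
                return target
            return None  # the offset points outside the buffer
    return None  # no following line matches


if __name__ == "__main__":
    buffer = ["alpha", "beta", "gamma", "end of list", "omega"]
    # With the cursor on line 1, d/end/-1 would affect lines 1 through
    # the line returned here.
    print(search_with_offset(buffer, 1, "end", -1))
```

Because an offset is given, an operator would treat the result as a whole-line target 
rather than stopping at the matched characters. 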


You can have the editor ignore the case of words in the searches it does by giving the 
command :se icCR. The command :se noicCR turns this off. 


Strings given to searches may actually be regular expressions. If you do not want or need 
this facility, you should 


set nomagic 


in your EXINIT. In this case, only the characters ^ and $ are special in patterns. The character 
\ is also then special (as it is most everywhere in the system), and may be used to get at the 
extended pattern matching facility. It is also necessary to use a \ before a / in a forward scan 
or a ? in a backward scan, in any case. The following table gives the extended forms when 
magic is set. 


^         at beginning of pattern, matches beginning of line 
$         at end of pattern, matches end of line 
.         matches any character 
\<        matches the beginning of a word 
\>        matches the end of a word 
[str]     matches any single character in str 
[^str]    matches any single character not in str 
[x-y]     matches any character between x and y 
*         matches any number of the preceding pattern 


If you use nomagic mode, then the . [ and * primitives are given with a preceding \. 
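

If you want to experiment with these forms outside the editor, the following Python sketch 
shows roughly equivalent patterns. It is an approximation, not the editor's matcher: 
Python's re module has no \< or \>, so \b (a word boundary) is assumed here as a stand-in 
for both, and the sample strings are made up for illustration. 

```python
import re

# Rough Python equivalents of the patterns in the table. Python's `re` has no
# \< or \>; \b (a word boundary) is used here as an approximation of both.
PATTERNS = [
    ("^The", "The end", True),          # ^ anchors at the beginning of the line
    ("^The", "At The end", False),
    ("end$", "The end", True),          # $ anchors at the end of the line
    ("b.t", "bat", True),               # . matches any character
    (r"\bcat\b", "a cat sat", True),    # editor form: \<cat\>
    (r"\bcat\b", "concatenate", False),
    ("[aeiou]", "sky", False),          # any single character in the set
    ("[^0-9]", "123", False),           # any single character not in the set
    ("[a-f]", "xyzc", True),            # any character between a and f
    ("lo*p", "lp", True),               # * allows zero or more of the preceding
]


def matches(pattern, text):
    """Return True if `pattern` matches somewhere in `text`."""
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    for pattern, text, expected in PATTERNS:
        print(pattern, repr(text), matches(pattern, text) == expected)
```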


8.5. More about input mode 


There are a number of characters which you can use to make corrections during input 
mode. These are summarized in the following table. 


^H        deletes the last input character 
^W        deletes the last input word, defined as by b 
erase     your erase character, same as ^H 
kill      your kill character, deletes the input on this line 
\         escapes a following ^H and your erase and kill 
ESC       ends an insertion 
DEL       interrupts an insertion, terminating it abnormally 
CR        starts a new line 
^D        backtabs over autoindent 
0^D       kills all the autoindent 
^^D       same as 0^D, but restores indent next line 
^V        quotes the next non-printing character into the file 


The most usual way of making corrections to input is by typing ^H to correct a single 
character, or by typing one or more ^W’s to back over incorrect words. If you use # as your 
erase character in the normal system, it will work like ^H. 


Your system kill character, normally @, ^X or ^U, will erase all the input you have given 
on the current line. In general, you can neither erase input back around a line boundary nor 
can you erase characters which you did not insert with this insertion command. To make 
corrections on the previous line after a new line has been started you can hit ESC to end the 
insertion, move over and make the correction, and then return to where you were to continue. 
The command A which appends at the end of the current line is often useful for continuing. 


If you wish to type in your erase or kill character (say # or @) then you must precede it 
with a \, just as you would do at the normal system command level. A more general way of 
typing non-printing characters into the file is to precede them with a ^V. The ^V echoes as a ^ 
character on which the cursor rests. This indicates that the editor expects you to type a control 
character. In fact you may type any character and it will be inserted into the file at that point.* 


If you are using autoindent you can backtab over the indent which it supplies by typing a 
^D. This backs up to a shiftwidth boundary. This only works immediately after the supplied 
autoindent. 
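

As a small worked example of the backtab arithmetic, the Python sketch below computes where 
one ^D would leave the indent. It assumes that "backs up to a shiftwidth boundary" means 
moving to the nearest multiple of shiftwidth below the current indent, and it uses a 
default shiftwidth of 8; the function name backtab is hypothetical. 

```python
def backtab(indent, shiftwidth=8):
    """Return the indent (in columns) after one ^D.

    Assumption: the indent drops to the nearest multiple of `shiftwidth`
    that is strictly less than the current indent, and never below zero.
    """
    if indent <= 0:
        return 0
    return ((indent - 1) // shiftwidth) * shiftwidth


if __name__ == "__main__":
    for columns in (12, 8, 3, 0):
        print(columns, "->", backtab(columns))
```

With this reading, an indent of 12 columns drops to 8 and an indent of 8 drops to 0. 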


When you are using autoindent you may wish to place a label at the left margin of a line. 
The way to do this easily is to type ^ and then ^D. The editor will move the cursor to the left 
margin for one line, and restore the previous indent on the next. You can also type a 0 
followed immediately by a ^D if you wish to kill all the indent and not have it come back on 
the next line. 


8.6. Upper case only terminals 


If your terminal has only upper case, you can still use vi by using the normal system 
convention for typing on such a terminal. Characters which you normally type are converted to 
lower case, and you can type upper case letters by preceding them with a \. The characters { ~ } 
| ` are not available on such terminals, but you can escape them as \( \^ \) \! \'. These 
characters are represented on the display in the same way they are typed.† 


8.7. Vi and ex 


Vi is actually one mode of editing within the editor ex. When you are running vi you can 
escape to the line oriented editor of ex by giving the command Q. All of the : commands 
which were introduced above are available in ex. Likewise, most ex commands can be invoked 
from vi using :. Just give them without the : and follow them with a CR. 


In rare instances, an internal error may occur in vi. In this case you will get a diagnostic 
and be left in the command mode of ex. You can then save your work and quit if you wish by 
giving a command x after the : which ex prompts you with, or you can reenter vi by giving ex a 
vi command. 


There are a number of things which you can do more easily in ex than in vi. Systematic 
changes in line oriented material are particularly easy. You can read the advanced editing 
documents for the editor ed to find out a lot more about this style of editing. Experienced 
users often mix their use of ex command mode and vi command mode to speed the work they are 
doing. 


8.8. Open mode: vi on hardcopy terminals and "glass tty’s" ‡ 


If you are on a hardcopy terminal or a terminal which does not have a cursor which can 
move off the bottom line, you can still use the command set of vi, but in a different mode. 
When you give a vi command, the editor will tell you that it is using open mode. This name 
comes from the open command in ex, which is used to get into the same mode. 


The only difference between visual mode and open mode is the way in which the text is 
displayed. 


In open mode the editor uses a single line window into the file, and moving backward and 
forward in the file causes new lines to be displayed, always below the current line. Two 
commands of vi work differently in open: z and ^R. The z command does not take parameters, but 
rather draws a window of context around the current line and then returns you to the current 
line. 


If you are on a hardcopy terminal, the ^R command will retype the current line. On such 
terminals, the editor normally uses two lines to represent the current line. The first line is a 
copy of the line as you started to edit it, and you work on the line below this line. When you 
delete characters, the editor types a number of \’s to show you the characters which are deleted. 
The editor also reprints the current line soon after such changes so that you can see what the 
line looks like again. 


It is sometimes useful to use this mode on very slow terminals which can support vi in the 
full screen mode. You can do this by entering ex and using an open command. 


* This is not quite true. The implementation of the editor does not allow the NULL (^@) character to appear 
in files. Also the LF (linefeed or ^J) character is used by the editor to separate lines in the file, so it cannot 
appear in the middle of a line. You can insert any other character, however, if you wait for the editor to 
echo the ^ before you type the character. In fact, the editor will treat a following letter as a request for the 
corresponding control character. This is the only way to type ^S or ^Q, since the system normally uses them 
to suspend and resume output and never gives them to the editor to process. 

† The \ character you give will not echo until you type another key. 

‡ Not available in all v2 editors due to memory constraints. 


Appendix: character functions 


This appendix gives the uses the editor makes of each character. The characters are 
presented in their order in the ASCII character set: Control characters come first, then most 
special characters, then the digits, upper and then lower case characters. 


For each character we tell a meaning it has as a command and any meaning it has during 
an insert. If it has only meaning as a command, then only this is discussed. Section numbers 
in parentheses indicate where the character is discussed; a ‘f’ after the section number means 
that the character is mentioned in a footnote. 


^@        Not a command character. If typed as the first character of an 
          insertion it is replaced with the last text inserted, and the insert 
          terminates. Only 128 characters are saved from the last insert; if 
          more characters were inserted the mechanism is not available. A ^@ 
          cannot be part of the file due to the editor implementation (7.5f). 

^A        Unused. 

^B        Backward window. A count specifies repetition. Two lines of 
          continuity are kept if possible (2.1, 6.1, 7.2). 

^C        Unused. 

^D        As a command, scrolls down a half-window of text. A count gives the 
          number of (logical) lines to scroll, and is remembered for future ^D 
          and ^U commands (2.1, 7.2). During an insert, backtabs over 
          autoindent white space at the beginning of a line (6.6, 7.5); this 
          white space cannot be backspaced over. 

^E        Exposes one more line below the current screen in the file, leaving 
          the cursor where it is if possible. (Version 3 only.) 

^F        Forward window. A count specifies repetition. Two lines of 
          continuity are kept if possible (2.1, 6.1, 7.2). 

^G        Equivalent to :fCR, printing the current file, whether it has been 
          modified, the current line number and the number of lines in the 
          file, and the percentage of the way through the file that you are. 

^H (BS)   Same as left arrow (see h). During an insert, eliminates the last 
          input character, backing over it but not erasing it; it remains so 
          you can see what you typed if you wish to type something only 
          slightly different (3.1, 7.5). 

^I (TAB)  Not a command character. When inserted it prints as some number of 
          spaces. When the cursor is at a tab character it rests at the last 
          of the spaces which represent the tab. The spacing of tabstops is 
          controlled by the tabstop option (4.1, 6.6). 

^J (LF)   Same as down arrow (see j). 

^K        Unused. 

^L        The ASCII formfeed character, this causes the screen to be cleared 
          and redrawn. This is useful after a transmission error, if 
          characters typed by a program other than the editor scramble the 
          screen, or after output is stopped by an interrupt (5.4, 7.2f). 

^M (CR)   A carriage return advances to the next line, at the first non-white 
          position in the line. Given a count, it advances that many lines 
          (2.3). During an insert, a CR causes the insert to continue onto 
          another line (3.1). 

^N        Same as down arrow (see j). 

^O        Unused. 
^P        Same as up arrow (see k). 

^Q        Not a command character. In input mode, ^Q quotes the next 
          character, the same as ^V, except that some teletype drivers will 
          eat the ^Q so that the editor never sees it. 

^R        Redraws the current screen, eliminating logical lines not 
          corresponding to physical lines (lines with only a single @ 
          character on them). On hardcopy terminals in open mode, retypes the 
          current line (5.4, 7.2, 7.8). 

^S        Unused. Some teletype drivers use ^S to suspend output until ^Q is 
          pressed. 

^T        Not a command character. During an insert, with autoindent set and 
          at the beginning of the line, inserts shiftwidth white space. 

^U        Scrolls the screen up, inverting ^D which scrolls down. Counts work 
          as they do for ^D, and the previous scroll amount is common to both. 
          On a dumb terminal, ^U will often necessitate clearing and redrawing 
          the screen further back in the file (2.1, 7.2). 

^V        Not a command character. In input mode, quotes the next character so 
          that it is possible to insert non-printing and special characters 
          into the file (4.2, 7.5). 

^W        Not a command character. During an insert, backs up as b would in 
          command mode; the deleted characters remain on the display (see ^H) 
          (7.5). 

^X        Unused. 

^Y        Exposes one more line above the current screen, leaving the cursor 
          where it is if possible. (No mnemonic value for this key; however, 
          it is next to ^U which scrolls up a bunch.) (Version 3 only.) 

^Z        If supported by the Unix system, stops the editor, exiting to the 
          top level shell. Same as :stopCR. Otherwise, unused. 

^[ (ESC)  Cancels a partially formed command, such as a z when no following 
          character has yet been given; terminates inputs on the last line 
          (read by commands such as : / and ?); ends insertions of new text 
          into the buffer. If an ESC is given when quiescent in command state, 
          the editor rings the bell or flashes the screen. You can thus hit 
          ESC if you don’t know what is happening till the editor rings the 
          bell. If you don’t know if you are in insert mode you can type ESCa, 
          and then material to be input; the material will be inserted 
          correctly whether or not you were in insert mode when you started 
          (1.5, 3.1, 7.5). 

^\        Unused. 

^]        Searches for the word which is after the cursor as a tag. Equivalent 
          to typing :ta, this word, and then a CR. Mnemonically, this command 
          is "go right to" (7.3). 

^^        Equivalent to :e #CR, returning to the previous position in the last 
          edited file, or editing a file which you specified if you got a "No 
          write since last change" diagnostic and do not want to have to type 
          the file name again (7.3). (You have to do a :w before ^^ will work 
          in this case. If you do not wish to write the file you should do 
          :e! #CR instead.) 

^_        Unused. Reserved as the command character for the Tektronix 4025 and 
          4027 terminal. 

SPACE     Same as right arrow (see l). 

!         An operator, which processes lines from the buffer with reformatting 
          commands. Follow ! with the object to be processed, and then the 
          command name terminated by CR. Doubling ! and preceding it by a 
          count causes count lines to be filtered, otherwise the count is 
          passed on to the object after the !. Thus 2!}fmtCR reformats the 
          next two paragraphs by running them through the program fmt. If you 
          are working on LISP, the command !%grindCR, given at the beginning 
          of a function, will run the text of the function through the LISP 
          grinder (6.7, 7.3). To read a file or the output of a command into 
          the buffer use :r (7.3). To simply execute a command use :! (7.3). 


"         Precedes a named buffer specification. There are named buffers 1-9 
          used for saving deleted text and named buffers a-z into which you 
          can place text (4.3, 6.3). 

#         The macro character which, when followed by a number, will 
          substitute for a function key on terminals without function keys 
          (6.9). In input mode, if this is your erase character, it will 
          delete the last character you typed in input mode, and must be 
          preceded with a \ to insert it, since it normally backs over the 
          last input character you gave. 

$         Moves to the end of the current line. If you :se listCR, then the 
          end of each line will be shown by printing a $ after the end of the 
          displayed text in the line. Given a count, advances to the count’th 
          following end of line; thus 2$ advances to the end of the following 
          line. 

%         Moves to the parenthesis or brace { } which balances the parenthesis 
          or brace at the current cursor position. 

&         A synonym for :&CR, by analogy with the ex & command. 

'         When followed by a ' returns to the previous context at the 
          beginning of a line. The previous context is set whenever the 
          current line is moved in a non-relative way. When followed by a 
          letter a-z, returns to the line which was marked with this letter 
          with a m command, at the first non-white character in the line 
          (2.2, 5.3). When used with an operator such as d, the operation 
          takes place over complete lines; if you use `, the operation takes 
          place from the exact marked place to the current cursor position 
          within the line. 

(         Retreats to the beginning of a sentence, or to the beginning of a 
          LISP s-expression if the lisp option is set. A sentence ends at a 
          . ! or ? which is followed by either the end of a line or by two 
          spaces. Any number of closing ) ] " and ' characters may appear 
          after the . ! or ?, and before the spaces or end of line. Sentences 
          also begin at paragraph and section boundaries (see { and [[ below). 
          A count advances that many sentences (4.2, 6.8). 

)         Advances to the beginning of a sentence. A count repeats the effect. 
          See ( above for the definition of a sentence (4.2, 6.8). 

*         Unused. 

+         Same as CR when used as a command. 

,         Reverse of the last f F t or T command, looking the other way in the 
          current line. Especially useful after hitting too many ; characters. 
          A count repeats the search. 

-         Retreats to the previous line at the first non-white character. This 
          is the inverse of + and RETURN. If the line moved to is not on the 
          screen, the screen is scrolled, or cleared and redrawn if this is 
          not possible. If a large amount of scrolling would be required the 
          screen is also cleared and redrawn, with the current line at the 
          center (2.3). 

.         Repeats the last command which changed the buffer. Especially useful 
          when deleting words or lines; you can delete some words/lines and 
          then hit . to delete more and more words/lines. Given a count, it 
          passes it on to the command being repeated. Thus after a 2dw, 3. 
          deletes three words (3.3, 6.3, 7.2, 7.4). 


/         Reads a string from the last line on the screen, and scans forward 
          for the next occurrence of this string. The normal input editing 
          sequences may be used during the input on the bottom line; an 
          returns to command state without ever searching. The search begins 
          when you hit CR to terminate the pattern; the cursor moves to the 
          beginning of the last line to indicate that the search is in 
          progress; the search may then be terminated with a DEL or RUB, or by 
          backspacing when at the beginning of the bottom line, returning the 
          cursor to its initial position. Searches normally wrap end-around to 
          find a string anywhere in the buffer. 

          When used with an operator the enclosed region is normally affected. 
          By mentioning an offset from the line matched by the pattern you can 
          force whole lines to be affected. To do this give a pattern with a 
          closing / and then an offset +n or -n. 

          To include the character / in the search string, you must escape it 
          with a preceding \. A ^ at the beginning of the pattern forces the 
          match to occur at the beginning of a line only; this speeds the 
          search. A $ at the end of the pattern forces the match to occur at 
          the end of a line only. More extended pattern matching is available, 
          see section 7.4; unless you set nomagic in your .exrc file you will 
          have to precede the characters . [ * and ~ in the search pattern 
          with a \ to get them to work as you would naively expect (1.5, 2.2, 
          6.1, 7.2, 7.4). 

0         Moves to the first character on the current line. Also used, in 
          forming numbers, after an initial 1-9. 

1-9       Used to form numeric arguments to commands (2.3, 7.2). 

:         A prefix to a set of commands for file and option manipulation and 
          escapes to the system. Input is given on the bottom line and 
          terminated with a CR, and the command then executed. You can return 
          to where you were by hitting DEL or RUB if you hit : accidentally 
          (see primarily 6.2 and 7.3). 

;         Repeats the last single character find which used f F t or T. A 
          count iterates the basic scan (4.1). 

<         An operator which shifts lines left one shiftwidth, normally 8 
          spaces. Like all operators, affects lines when repeated, as in <<. 
          Counts are passed through to the basic object, thus 3<< shifts three 
          lines (6.6, 7.2). 

=         Reindents line for LISP, as though they were typed in with lisp and 
          autoindent set (6.8). 

>         An operator which shifts lines right one shiftwidth, normally 8 
          spaces. Affects lines when repeated as in >>. Counts repeat the 
          basic object (6.6, 7.2). 

?         Scans backwards, the opposite of /. See the / description above for 
          details on scanning (2.2, 6.1, 7.4). 

@         A macro character (6.9). If this is your kill character, you must 
          escape it with a \ to type it in during input mode, as it normally 
          backs over the input you have given on the current line (3.1, 3.4, 
          7.5). 

A         Appends at the end of line, a synonym for $a (7.2). 

B         Backs up a word, where words are composed of non-blank sequences, 
          placing the cursor at the beginning of the word. A count repeats the 
          effect (2.4). 

C         Changes the rest of the text on the current line; a synonym for c$. 

D         Deletes the rest of the text on the current line; a synonym for d$. 

E         Moves forward to the end of a word, defined as blanks and 
          non-blanks, like B and W. A count repeats the effect. 


F         Finds a single following character, backwards in the current line. A 
          count repeats this search that many times (4.1). 

G         Goes to the line number given as preceding argument, or the end of 
          the file if no preceding count is given. The screen is redrawn with 
          the new current line in the center if necessary (7.2). 

H         Home arrow. Homes the cursor to the top line on the screen. If a 
          count is given, then the cursor is moved to the count’th line on the 
          screen. In any case the cursor is moved to the first non-white 
          character on the line. If used as the target of an operator, full 
          lines are affected (2.3, 3.2). 

I         Inserts at the beginning of a line; a synonym for ^i. 

J         Joins together lines, supplying appropriate white space: one space 
          between words, two spaces after a ., and no spaces at all if the 
          first character of the joined on line is ). A count causes that many 
          lines to be joined rather than the default two (6.5, 7.1f). 

K         Unused. 

L         Moves the cursor to the first non-white character of the last line 
          on the screen. With a count, to the first non-white of the count’th 
          line from the bottom. Operators affect whole lines when used with L 
          (2.3). 

M         Moves the cursor to the middle line on the screen, at the first 
          non-white position on the line (2.3). 

N         Scans for the next match of the last pattern given to / or ?, but in 
          the reverse direction; this is the reverse of n. 

O         Opens a new line above the current line and inputs text there up to 
          an ESC. A count can be used on dumb terminals to specify a number of 
          lines to be opened; this is generally obsolete, as the slowopen 
          option works better (3.1). 

P         Puts the last deleted text back before/above the cursor. The text 
          goes back as whole lines above the cursor if it was deleted as whole 
          lines. Otherwise the text is inserted between the characters before 
          and at the cursor. May be preceded by a named buffer specification 
          "x to retrieve the contents of the buffer; buffers 1-9 contain 
          deleted material, buffers a-z are available for general use (6.3). 

Q         Quits from vi to ex command mode. In this mode, whole lines form 
          commands, ending with a RETURN. You can give all the : commands; the 
          editor supplies the : as a prompt (7.7). 

R         Replaces characters on the screen with characters you type (overlay 
          fashion). Terminates with an ESC. 

S         Changes whole lines, a synonym for cc. A count substitutes for that 
          many lines. The lines are saved in the numeric buffers, and erased 
          on the screen before the substitution begins. 

T         Takes a single following character, locates the character before 
          the cursor in the current line, and places the cursor just after 
          that character. A count repeats the effect. Most useful with 
          operators such as d (4.1). 

U         Restores the current line to its state before you started changing 
          it (3.5). 

V         Unused. 


W         Moves forward to the beginning of a word in the current line, where 
          words are defined as sequences of blank/non-blank characters. A 
          count repeats the effect (2.4). 

X         Deletes the character before the cursor. A count repeats the effect, 
          but only characters on the current line are deleted. 

Y         Yanks a copy of the current line into the unnamed buffer, to be put 
          back by a later p or P; a very useful synonym for yy. A count yanks 
          that many lines. May be preceded by a buffer name to put lines in 
          that buffer (7.4). 

ZZ        Exits the editor. (Same as :xCR.) If any changes have been made, the 
          buffer is written out to the current file. Then the editor quits. 

[[        Backs up to the previous section boundary. A section begins at each 
          macro in the sections option, normally a ‘.NH’ or ‘.SH’ and also at 
          lines which start with a formfeed ^L. Lines beginning with { also 
          stop [[; this makes it useful for looking backwards, a function at a 
          time, in C programs. If the option lisp is set, stops at each ( at 
          the beginning of a line, and is thus useful for moving backwards at 
          the top level LISP objects (4.2, 6.1, 6.6, 7.2). 

\         Unused. 

]]        Forward to a section boundary, see [[ for a definition (4.2, 6.1, 
          6.6, 7.2). 

^         Moves to the first non-white position on the current line (4.4). 

_         Unused. 

`         When followed by a ` returns to the previous context. The previous 
          context is set whenever the current line is moved in a non-relative 
          way. When followed by a letter a-z, returns to the position which 
          was marked with this letter with a m command. When used with an 
          operator such as d, the operation takes place from the exact marked 
          place to the current position within the line; if you use ', the 
          operation takes place over complete lines (2.2, 5.3). 

a         Appends arbitrary text after the current cursor position; the insert 
          can continue onto multiple lines by using RETURN within the insert. 
          A count causes the inserted text to be replicated, but only if the 
          inserted text is all on one line. The insertion terminates with an 
          ESC (3.1, 7.2). 

b         Backs up to the beginning of a word in the current line. A word is a 
          sequence of alphanumerics, or a sequence of special characters. A 
          count repeats the effect (2.4). 

c         An operator which changes the following object, replacing it with 
          the following input text up to an ESC. If more than part of a single 
          line is affected, the text which is changed away is saved in the 
          numeric named buffers. If only part of the current line is affected, 
          then the last character to be changed away is marked with a $. A 
          count causes that many objects to be affected, thus both 3c) and c3) 
          change the following three sentences (7.4). 

d         An operator which deletes the following object. If more than part of 
          a line is affected, the text is saved in the numeric buffers. A 
          count causes that many objects to be affected; thus 3dw is the same 
          as d3w (3.3, 3.4, 4.1, 7.4). 

e         Advances to the end of the next word, defined as for b and w. A 
          count repeats the effect (2.4, 3.1). 

f         Finds the first instance of the next character following the cursor 
          on the current line. A count repeats the find (4.1). 

g         Unused. 


Arrow keys h, j, k, l, and H. 


h         Left arrow. Moves the cursor one character to the left. Like the 
          other arrow keys, either h, the left arrow key, or one of the 
          synonyms (^H) has the same effect. On v2 editors, arrow keys on 
          certain kinds of terminals (those which send escape sequences, such 
          as vt52, c100, or hp) cannot be used. A count repeats the effect 
          (3.1, 7.5). 


Inserts text before the cursor, otherwise like a (7.2). 
Down arrow. Moves the cursor one line down in the same column. If the 


position does not exist, vi comes as close as possible to the same column. 


Synonyms include “J (linefeed) and “N. 
Up arrow. Moves the cursor one line up. “P is a synonym. 


_ Right arrow. Moves the cursor one character to the right. SPACE is a 
‘synonym. 


Marks the current position of the cursor in the mark register which is specified 
by the next character a—z. Return to this position or use with an operator 
using * or ’ (5.3). 


Repeats the last / or ? scanning commands (2.2). 
Opens new lines below the current line; otherwise like O (3.1). 
Puts text after/below the cursor; otherwise like P (6.3). 


~ Unused. 


+. 


Replaces the single character at the cursor with a single character you type. 
The new character may be a RETURN; this is the easiest way to split lines. A 
count replaces each of the following count characters with the single character 


_given; see R above which is the more usually useful iteration of r (3.2). 


Changes the single character under the cursor to the text which follows up to 
an ESC; given a count, that many characters from the current line are changed. 
The last character to be changed is marked with $ as in ¢ (3.2). 


Advances the cursor upto the character before the next character typed. Most 
useful with operators such as d and ¢ to delete the characters up to a following 
character. You can use . to delete more if this doesn’t delete enough the first 
time (4.1). 


Undoes the last change made to the current buffer. If repeated, will alternate 


between these two states, thus is its own inverse. When used after an insert 


which inserted text on more than one line, the lines are saved in the numeric 
named buffers (3.5). 


Unused. 


_ Advances to the beginning of the next word, as defined by b (2.4). 


Deletes the single character under the cursor. With a count deletes deletes 
that many characters forward from the cursor position, but only on the current 
line (6.5). 


An operator, yanks the following object into the unnamed temporary buffer. If 
preceded by a named buffer specification, "x, the text is placed in that buffer 
also. Text can be recovered by a later p or P (7.4). 


z       Redraws the screen with the current line placed as specified by the following
        character: RETURN specifies the top of the screen, . the center of the screen,
        and - the bottom of the screen. A count may be given after the z and
        before the following character to specify the new screen size for the redraw. A
        count before the z gives the number of the line to place in the center of the
        screen instead of the default current line (5.4).

{       Retreats to the beginning of the preceding paragraph. A paragraph
        begins at each macro in the paragraphs option, normally ‘.IP’, ‘.LP’,
        ‘.PP’, ‘.QP’ and ‘.bp’. A paragraph also begins after a completely empty line,
        and at each section boundary (see [[ above) (4.2, 6.8, 7.6).

|       Places the cursor on the character in the column specified by the count (7.1,
        7.2).

}       Advances to the beginning of the next paragraph. See { for the definition of
        paragraph (4.2, 6.8, 7.6).

~       Unused.

^? (DEL)
        Interrupts the editor, returning it to command accepting state (1.5, 7.5).


1. Starting ex


Each instance of the editor has a set of options, which can be set to tailor it to your lik- 
ing. The command edit invokes a version of ex designed for more casual or beginning users 
by changing the default settings of some of these options. To simplify the description which 
follows we assume the default settings of the options. 


When invoked, ex determines the terminal type from the TERM variable in the environ-
ment. If there is a TERMCAP variable in the environment, and the type of the terminal
described there matches the TERM variable, then that description is used. Also if the
TERMCAP variable contains a pathname (beginning with a /) then the editor will seek the
description of the terminal in that file (rather than the default /etc/termcap). If there is a
variable EXINIT in the environment, then the editor will execute the commands in that vari-
able; otherwise, if there is a file .exrc in your HOME directory, ex reads commands from that
file, simulating a source command. Option setting commands placed in EXINIT or .exrc will
be executed before each editor session.
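
The startup rules above can be sketched as a small decision procedure. This is an illustrative Python model only, not the editor's own code: the function names termcap_source and startup_commands are hypothetical, and the sketch assumes that a TERMCAP description "matches" TERM when TERM appears among the names in the first field of the entry.

```python
def termcap_source(env):
    """Decide where the terminal description comes from (toy model)."""
    term = env.get("TERM", "")
    termcap = env.get("TERMCAP", "")
    if termcap.startswith("/"):
        # TERMCAP holds a pathname: look there instead of /etc/termcap.
        return ("file", termcap)
    if termcap:
        # Assumption: the first field of an entry lists names separated by '|'.
        names = termcap.split(":", 1)[0].split("|")
        if term in names:
            return ("environment", termcap)
    return ("file", "/etc/termcap")


def startup_commands(env, exrc_exists):
    """EXINIT takes precedence; .exrc in HOME is consulted only without it."""
    if "EXINIT" in env:
        return ("EXINIT", env["EXINIT"])
    if exrc_exists:
        return (".exrc", env.get("HOME", "") + "/.exrc")
    return None
```

Note that in this model a .exrc file is ignored whenever EXINIT is present, which is how the paragraph above reads.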


A command to enter ex has the following prototype:†

    ex [ - ] [ -v ] [ -t tag ] [ -r ] [ -l ] [ -wn ] [ -x ] [ -R ] [ +command ] name ...

The most common case edits a single file with no options, i.e.:

    ex name


The - command line option suppresses all interactive-user feedback and is useful in
processing editor scripts in command files. The -v option is equivalent to using vi rather
than ex. The -t option is equivalent to an initial tag command, editing the file containing the
tag and positioning the editor at its definition. The -r option is used in recovering after an
editor or system crash, retrieving the last saved version of the named file or, if no file is
specified, typing a list of saved files. The -l option sets up for editing LISP, setting the
showmatch and lisp options. The -w option sets the default window size to n, and is useful
on dialups to start in small windows. The -x option causes ex to prompt for a key, which is
used to encrypt and decrypt the contents of the file, which should already be encrypted using
the same key; see crypt(1). The -R option sets the readonly option at the start.‡ Name
arguments indicate files to be edited. An argument of the form +command indicates that the
editor should begin by executing the specified command. If command is omitted, then it
defaults to “$”, positioning the editor at the last line of the first file initially. Other useful
commands here are scanning patterns of the form “/pat” or line numbers, e.g. “+100” starting
at line 100.


The financial support of an IBM Graduate Fellowship and the National Science Foundation under grants
MCS74-07644-A03 and MCS78-07291 is gratefully acknowledged.

† Brackets ‘[’ ‘]’ surround optional parameters here.


2. File manipulation 


2.1. Current file 


Ex is normally editing the contents of a single file, whose name is recorded in the 
current file name. Ex performs all editing actions in a buffer (actually a temporary file) into 
which the text of the file is initially read. Changes made to the buffer have no effect on the 
file being edited unless and until the buffer contents are written out to the file with a write 
command. After the buffer contents are written, the previous contents of the written file are 
no longer accessible. When a file is edited, its name becomes the current file name, and its 
contents are read into the buffer. 


The current file is almost always considered to be edited. This means that the contents 
of the buffer are logically connected with the current file name, so that writing the current 
buffer contents onto that file, even if it exists, is a reasonable action. If the current file is not 
edited then ex will not normally write on it if it already exists.* 


2.2. Alternate file


Each time a new value is given to the current file name, the previous current file name is 
saved as the alternate file name. Similarly if a file is mentioned but does not become the 
current file, it is saved as the alternate file name. 


2.3. Filename expansion 


Filenames within the editor may be specified using the normal shell expansion conven- 
tions. In addition, the character ‘%’ in filenames is replaced by the current file name and the 
character ‘#’ by the alternate file name.†
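
The bookkeeping of sections 2.2 and 2.3 can be modelled in a few lines. The Python class below is a sketch under stated assumptions, not the editor's implementation: the class and method names are hypothetical, shell expansion is left out, and re-editing the file that is already current is assumed to leave the alternate name alone.

```python
class FileNames:
    """Toy model of the current and alternate file names."""

    def __init__(self):
        self.current = None
        self.alternate = None

    def expand(self, name):
        """Replace '%' by the current name and '#' by the alternate name."""
        out = []
        for ch in name:
            if ch == "%":
                if self.current is None:
                    raise ValueError("no current file name")
                out.append(self.current)
            elif ch == "#":
                if self.alternate is None:
                    raise ValueError("no alternate file name")
                out.append(self.alternate)
            else:
                out.append(ch)
        return "".join(out)

    def edit(self, name):
        """A new current file name; the previous one becomes the alternate."""
        name = self.expand(name)
        if self.current is not None and name != self.current:
            self.alternate = self.current
        self.current = name

    def mention(self, name):
        """A file is named without becoming the current file."""
        self.alternate = self.expand(name)
```

With this model, editing a.c and then b.c makes a.c the alternate, and a following edit of ‘#’ swaps the two names, which is the alternation between two files that the footnote describes.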


2.4. Multiple files and named buffers 


If more than one file is given on the command line, then the first file is edited as 
described above. The remaining arguments are placed with the first file in the argument list. 
The current argument list may be displayed with the args command. The next file in the 
argument list may be edited with the next command. The argument list may also be 
respecified by specifying a list of names to the next command. These names are expanded, 
the resulting list of names becomes the new argument list, and ex edits the first file on the 
list. 


For saving blocks of text while editing, and especially when editing more than one file, 
ex has a group of named buffers. These are similar to the normal buffer, except that only a 
limited number of operations are available on them. The buffers have names a through z.‡


‡ Not available in all v2 editors due to memory constraints.

* The file command will say “[Not edited]” if the current file is not considered edited.

† This makes it easy to deal alternately with two files and eliminates the need for retyping the name
supplied on an edit command after a No write since last change diagnostic is received.

‡ It is also possible to refer to A through Z; the upper case buffers are the same as the lower but commands
append to named buffers rather than replacing if upper case names are used.


2.5. Read only


It is possible to use ex in read only mode to look at files that you have no intention of 
modifying. This mode protects you from accidentally overwriting the file. Read only mode is
on when the readonly option is set. It can be turned on with the —R command line option, 
by the view command line invocation, or by setting the readonly option. It can be cleared by 
setting noreadonly. It is possible to write, even while in read only mode, by indicating that 
you really know what you are doing. You can write to a different file, or can use the ! form of 
write, even while in read only mode. 


3. Exceptional Conditions 


3.1. Errors and interrupts 


When errors occur ex (optionally) rings the terminal bell and, in any case, prints an 
error diagnostic. If the primary input is from a file, editor processing will terminate. If an 
interrupt signal is received, ex prints “Interrupt” and returns to its command level. If the 
primary input is a file, then ex will exit when this occurs. 


3.2. Recovering from hangups and crashes 


If a hangup signal is received and the buffer has been modified since it was last written 
out, or if the system crashes, either the editor (in the first case) or the system (after it reboots 
in the second) will attempt to preserve the buffer. The next time you log in you should be 
able to recover the work you were doing, losing at most a few lines of changes from the last 
point before the hangup or editor crash. To recover a file you can use the -r option. If you
were editing the file resume, then you should change to the directory where you were when
the crash occurred, giving the command


    ex -r resume


After checking that the retrieved file is indeed ok, you can write it over the previous contents 
of that file. 


You will normally get mail from the system telling you when a file has been saved after a 
crash. The command 


    ex -r


will print a list of the files which have been saved for you. (In the case of a hangup, the file 
will not appear in the list, although it can be recovered.) 


4. Editing modes


Ex has five distinct modes. The primary mode is command mode. Commands are 
entered in command mode when a ‘:’ prompt is present, and are executed each time a com- 
plete line is sent. In text input mode ex gathers input lines and places them in the file. The 
append, insert, and change commands use text input mode. No prompt is printed when you 
are in text input mode. This mode is left by typing a ‘.’ alone at the beginning of a line, and 
command mode resumes. 


The last three modes are open and visual modes, entered by the commands of the same 
name, and, within open and visual modes, text insertion mode. Open and visual modes allow
local editing operations to be performed on the text in the file. The open command displays 
one line at a time on any terminal while visual works on CRT terminals with random position- 
ing cursors, using the screen as a (single) window for file editing changes. These modes are 
described (only) in An Introduction to Display Editing with Vi. 


3-86 Ex Reference Manual 


5. Command structure 


Most command names are English words, and initial prefixes of the words are acceptable 
abbreviations. The ambiguity of abbreviations is resolved in favor of the more commonly used 
commands.* 
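
One way to picture this rule is a command table searched in a fixed order, where the first name that begins with the abbreviation wins. The Python sketch below is illustrative only: the table is a shortened, hypothetical one, and its ordering merely stands in for "more commonly used first".

```python
# Hypothetical, shortened command table. The order is an assumption of this
# sketch: earlier entries win when an abbreviation is ambiguous.
COMMANDS = [
    "append", "args", "change", "copy", "delete", "edit", "file", "global",
    "insert", "join", "list", "move", "next", "number", "print", "put",
    "quit", "read", "substitute", "set", "shell", "source", "undo", "write",
]


def resolve(abbrev):
    """Return the first command name that starts with the abbreviation."""
    if not abbrev:
        raise ValueError("empty command name")
    for name in COMMANDS:
        if name.startswith(abbrev):
            return name
    raise ValueError("not an editor command: " + abbrev)
```

Because substitute precedes set in this table, ‘s’ resolves to substitute and ‘se’ is the shortest prefix that reaches set, in line with the footnote's example.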


5.1. Command parameters 


Most commands accept prefix addresses specifying the lines in the file upon which they 
are to have effect. The forms of these addresses will be discussed below. A number of com- 
mands also may take a trailing count specifying the number of lines to be involved in the 
command.† Thus the command “10p” will print the tenth line in the buffer while “delete 5”
will delete five lines from the buffer, starting with the current line. 


Some commands take other information or parameters, this information always being 
given after the command name.‡


5.2. Command variants 


A number of commands have two distinct variants. The variant form of the command is 
invoked by placing an ‘!’ immediately after the command name. Some of the default variants 
may be controlled by options; in this case, the ‘!’ serves to toggle the default. 


5.3. Flags after commands 


The characters ‘#’, ‘p’ and ‘l’ may be placed after many commands.** In this case, the
command abbreviated by these characters is executed after the command completes. Since ex 
normally prints the new current line after each change, ‘p’ is rarely necessary. Any number of 
‘+’ or ‘—’ characters may also be given with these flags. If they appear, the specified offset is 
applied to the current line value before the printing command is executed. 


5.4. Comments 


It is possible to give editor commands which are ignored. This is useful when making 
complex editor scripts for which comments are desired. The comment character is the double
quote: ". Any command line beginning with " is ignored. Comments beginning with " may
also be placed at the ends of commands, except in cases where they could be confused as part
of text (shell escapes and the substitute and map commands).


5.5. Multiple commands per line 


More than one command may be placed on a line by separating each pair of commands
by a ‘|’ character. However the global commands, comments, and the shell escape ‘!’ must be
the last command on a line, as they are not terminated by a ‘|’.


5.6. Reporting large changes 


Most commands which change the contents of the editor buffer give feedback if the 
scope of the change exceeds a threshold given by the report option. This feedback helps to 
detect undesirably large changes so that they may be quickly and easily reversed with an 
undo. After commands with more global effect such as global or visual, you will be informed 
if the net change in the number of lines in the buffer during this command exceeds this thres- 
hold. 


* As an example, the command substitute can be abbreviated ‘s’ while the shortest available abbreviation
for the set command is ‘se’.

† Counts are rounded down if necessary.

‡ Examples would be option names in a set command, i.e. “set number”, a file name in an edit command, a
regular expression in a substitute command, or a target address for a copy command, i.e. “1,5 copy 25”.

** A ‘p’ or ‘l’ must be preceded by a blank or tab except in the single special case ‘dp’.


6. Command addressing


6.1. Addressing primitives 


.             The current line. Most commands leave the current line as the last line
              which they affect. The default address for most commands is the
              current line, thus ‘.’ is rarely used alone as an address.

n             The nth line in the editor’s buffer, lines being numbered sequentially
              from 1.

$             The last line in the buffer.

%             An abbreviation for “1,$”, the entire buffer.

+n -n         An offset relative to the current buffer line.†

/pat/ ?pat?   Scan forward and backward respectively for a line containing pat, a reg-
              ular expression (as defined below). The scans normally wrap around the
              end of the buffer. If all that is desired is to print the next line contain-
              ing pat, then the trailing / or ? may be omitted. If pat is omitted or
              explicitly empty, then the last regular expression specified is located.‡

'' 'x         Before each non-relative motion of the current line ‘.’, the previous
              current line is marked with a tag, subsequently referred to as ‘''’. This
              makes it easy to refer or return to this previous context. Marks may
              also be established by the mark command, using single lower case
              letters x and the marked lines referred to as ‘'x’.


6.2. Combining addressing primitives 


Addresses to commands consist of a series of addressing primitives, separated by ‘,’ or ‘;’. 
Such address lists are evaluated left-to-right. When addresses are separated by ‘;’ the current 
line ‘.’ is set to the value of the previous addressing expression before the next address is 
interpreted. If more addresses are given than the command requires, then all but the last one 
or two are ignored. If the command takes two addresses, the first addressed line must precede 
the second in the buffer.* 
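
The evaluation rules of sections 6.1 and 6.2 can be made concrete with a small evaluator. The Python sketch below is a toy model, not the editor's parser: it handles only ‘.’, ‘$’, line numbers and ‘+’/‘-’ offsets (no patterns, marks or embedded blanks), and the function names are hypothetical.

```python
def eval_address(addr, dot, last):
    """Evaluate one simple address; dot is the current line, last is '$'."""
    i, n = 0, len(addr)
    if i < n and addr[i] == ".":
        line, i = dot, i + 1
    elif i < n and addr[i] == "$":
        line, i = last, i + 1
    elif i < n and addr[i].isdigit():
        j = i
        while j < n and addr[j].isdigit():
            j += 1
        line, i = int(addr[i:j]), j
    else:
        line = dot  # null address or leading offset: start from the current line
    while i < n:
        sign = addr[i]
        if sign not in ("+", "-"):
            raise ValueError("unsupported address: " + addr)
        i += 1
        j = i
        while j < n and addr[j].isdigit():
            j += 1
        amount = int(addr[i:j]) if j > i else 1  # a bare '+' or '-' counts as 1
        line = line + amount if sign == "+" else line - amount
        i = j
    return line


def eval_address_list(spec, dot, last):
    """Evaluate left to right; after ';' the current line becomes that address."""
    results = []
    start = 0
    for pos, ch in enumerate(spec + ","):  # the extra ',' ends the final address
        if ch in ",;":
            value = eval_address(spec[start:pos], dot, last)
            results.append(value)
            if ch == ";":
                dot = value
            start = pos + 1
    return results[-2:]  # all but the last one or two addresses are ignored
```

For example, with the current line at 1, the list ‘10,+2’ yields lines 10 and 3 in this model, while ‘10;+2’ yields 10 and 12 because the ‘;’ moves the current line to 10 before the offset is applied.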


7. Command descriptions


The following form is a prototype for all ex commands:


address command ! parameters count flags 


All parts are optional; the degenerate case is the empty command which prints the next line in 
the file. For sanity with use from within visual mode, ex ignores a “:” preceding any com- 
mand. 


In the following command descriptions, the default addresses are shown in parentheses, 
which are not, however, part of the command. 


abbreviate word rhs abbr: ab 


Add the named abbreviation to the current list. When in input mode in visual, if word 
is typed as a complete word, it will be changed to rhs. 


† The forms ‘.+3’ ‘+3’ and ‘+++’ are all equivalent; if the current line is line 100 they all address line 103.

‡ The forms \/ and \? scan using the last regular expression used in a scan; after a substitute // and ?? would
scan using the substitute’s regular expression.

* Null address specifications are permitted in a list of addresses, the default in this case is the current line
‘.’; thus ‘,100’ is equivalent to ‘.,100’. It is an error to give a prefix address to a command which expects
none.


(.) append abbr: a
text 


Reads the input text and places it after the specified line. After the command, ‘.’ 
addresses the last line input or the specified line if no lines were input. If address ‘0’ is 
given, text is placed at the beginning of the buffer. 


a! 

text 
The variant flag to append toggles the setting for the autoindent option during the 
input of text. 

args 
The members of the argument list are printed, with the current argument delimited by
[ and ].

(.,.) change count abbr: c

text 
Replaces the specified lines with the input text. The current line becomes the last line 
input; if no lines were input it is left as for a delete. 

c!

text 
The variant toggles autoindent during the change. 

(.,.) copy addr flags abbr: co
A copy of the specified lines is placed after addr, which may be ‘0’. The current line ‘.’
addresses the last line of the copy. The command t is a synonym for copy.

(.,.) delete buffer count flags abbr: d
Removes the specified lines from the buffer. The line after the last line deleted becomes 
the current line; if the lines deleted were originally at the end, the new last line becomes 
the current line. If a named buffer is specified by giving a letter, then the specified lines 
are saved in that buffer, or appended to it if an upper case letter is used. 

edit file abbr: e 

ex file 


Used to begin an editing session on a new file. The editor first checks to see if the buffer 
has been modified since the last write command was issued. If it has been, a warning is 
issued and the command is aborted. The command otherwise deletes the entire contents 
of the editor buffer, makes the named file the current file and prints the new filename. 
After ensuring that this file is sensible† the editor reads the file into its buffer.


If the read of the file completes without error, the number of lines and characters read is
typed. If there were any non-ASCII characters in the file they are stripped of their non-
ASCII high bits, and any null characters in the file are discarded. If none of these errors
occurred, the file is considered edited. If the last line of the input file is missing the
trailing newline character, it will be supplied and a complaint will be issued. This com-
mand leaves the current line ‘.’ at the last line read.‡


† I.e., that it is not a binary file such as a directory, a block or character special file other than /dev/tty, a
terminal, or a binary or executable file (as indicated by the first word).


e! file 


The variant form suppresses the complaint about modifications having been made and 
not written from the editor buffer, thus discarding all changes which have been made 
before editing the new file. 


e +n file 


Causes the editor to begin at line n rather than at the last line; n may also be an editor
command containing no spaces, e.g.: “+/pat”. 


file abbr: f 


Prints the current file name, whether it has been ‘[Modified]’ since the last write com- 
mand, whether it is read only, the current line, the number of lines in the buffer, and 
the percentage of the way through the buffer of the current line.* 


file file 


The current file name is changed to file which is considered ‘[Not edited]’. 


(1,$) global /pat/ cmds abbr: g 


First marks each line among those specified which matches the given regular expression. 
Then the given command list is executed with ‘.’ initially set to each marked line. 


The command list consists of the remaining commands on the current input line and 
may continue to multiple lines by ending all but the last such line with a ‘\’. If cmds
(and possibly the trailing / delimiter) is omitted, each line matching pat is printed. 
Append, insert, and change commands and associated input are permitted; the ‘.’ ter- 
minating input may be omitted if it would be on the last line of the command list. 
Open and visual commands are permitted in the command list and take input from the 
terminal. 


The global command itself may not appear in cmds. The undo command is also not per-
mitted there, as undo instead can be used to reverse the entire global command. The
options autoprint and autoindent are inhibited during a global, and the value of the
report option is temporarily infinite, in deference to a report for the entire global.
Finally, the context mark ‘''’ is set to the value of ‘.’ before the global command
begins and is not changed during a global command, except perhaps by an open or
visual within the global.


g! /pat/ cmds abbr: v

The variant form of global runs cmds at each line not matching pat.

(.) insert abbr: i
text


Places the given text before the specified line. The current line is left at the last line 
input; if there were none input it is left at the line before the addressed line. This com- 
mand differs from append only in the placement of text. 


‡ If executed from within open or visual, the current line is initially the first line of the file.

* In the rare case that the current file is ‘[Not edited]’ this is noted also; in this case you have to use the
form w! to write to the file, since the editor is not sure that a write will not destroy a file unrelated to the
current contents of the buffer.


i!
text 


The variant toggles autoindent during the insert. 


(.,.+1) join count flags abbr: j


Places the text from a specified range of lines together on one line. White space is 
adjusted at each junction to provide at least one blank character, two if there was a ‘.’ at 
the end of the line, or none if the first following character is a ‘)’. If there is already 
white space at the end of the line, then the white space at the start of the next line will 
be discarded. 
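
The white space rule can be read in more than one way; the Python sketch below implements one plausible reading and is not taken from the editor's source. The function names are hypothetical. It assumes that leading white space on the following line is always dropped, that nothing is added when the joined text already ends in white space, and that the simple keyword models the j! variant.

```python
def join_pair(left, right):
    """Join two lines following one reading of the white space rule."""
    right = right.lstrip(" \t")      # leading white space on the next line is dropped
    if not right:
        return left
    if right.startswith(")"):
        sep = ""                     # no blank before a ')'
    elif left == "" or left[-1] in " \t":
        sep = ""                     # white space is already present
    elif left.endswith("."):
        sep = "  "                   # two blanks after a '.'
    else:
        sep = " "
    return left + sep + right


def join_lines(lines, simple=False):
    """Join a range of lines; simple=True models j!, plain concatenation."""
    if simple:
        return "".join(lines)
    result = lines[0]
    for line in lines[1:]:
        result = join_pair(result, line)
    return result
```

Under this reading, joining "The end." with "Next" gives two blanks between the sentences, while joining "f(x" with an indented ")" gives "f(x)".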


j! 
The variant causes a simpler join with no white space processing; the characters in the 
lines are simply concatenated. 

(.) k x


The k command is a synonym for mark. It does not require a blank or tab before the 
following letter. 


(.,.) list count flags 


Prints the specified lines in a more unambiguous way: tabs are printed as ‘^I’ and the
end of each line is marked with a trailing ‘$’. The current line is left at the last line 
printed. 


map lhs rhs 


The map command is used to define macros for use in visual mode. Lhs should be a
single character, or the sequence “#n”, for n a digit, referring to function key n. When
this character or function key is typed in visual mode, it will be as though the
corresponding rhs had been typed. On terminals without function keys, you can type
“#n”. See section 6.9 of the “Introduction to Display Editing with Vi” for more details.


(.) mark x 


Gives the specified line mark x, a single lower case letter. The x must be preceded by a 
blank or a tab. The addressing form ‘'x’ then addresses this line. The current line is not
affected by this command. 


(.,.) move addr abbr: m


The move command repositions the specified lines to be after addr. The first of the 
moved lines becomes the current line. 


next abbr: n 
The next file from the command line argument list is edited. 


n! 


The variant suppresses warnings about the modifications to the buffer not having been
written out, discarding (irretrievably) any changes which may have been made.

n filelist
n +command filelist 


The specified filelist is expanded and the resulting list replaces the current argument 
list; the first file in the new list is then edited. If command is given (it must contain no 
spaces), then it is executed after editing the first such file. 


(.,.) number count flags abbr: # or nu


Prints each specified line preceded by its buffer line number. The current line is left at 
the last line printed. 


(.) open flags abbr: o
(.) open /pat/ flags

Enters intraline editing open mode at each addressed line. If pat is given, then the cur-
sor will be placed initially at the beginning of the string matched by the pattern. To exit
this mode use Q. See An Introduction to Display Editing with Vi for more details.‡


preserve 


The current editor buffer is saved as though the system had just crashed. This com- 
mand is for use only in emergencies when a write command has resulted in an error and 
you don’t know how to save your work. After a preserve you should seek help. 


(.,.) print count abbr: p or P 


Prints the specified lines with non-printing characters printed as control characters ‘^x’;
delete (octal 177) is represented as ‘^?’. The current line is left at the last line printed.


(.) put buffer abbr: pu


Puts back previously deleted or yanked lines. Normally used with delete to effect 
movement of lines, or with yank to effect duplication of lines. If no buffer is specified, 
then the last deleted or yanked text is restored.* By using a named buffer, text may be 
restored that was saved there at any previous time. 


quit abbr: q 


Causes ex to terminate. No automatic write of the editor buffer to a file is performed. 
However, ex issues a warning message if the file has changed since the last write com-
mand was issued, and does not quit.† Normally, you will wish to save your changes, and
you should give a write command; if you wish to discard them, use the q! command variant.

q!

Quits from the editor, discarding changes to the buffer without complaint.

(.) read file abbr: r


Places a copy of the text of the given file in the editing buffer after the specified line. If 
no file is given the current file name is used. The current file name is not changed 
unless there is none in which case file becomes the current name. The sensibility res- 
trictions for the edit command apply here also. If the file buffer is empty and there is 
no current name then ex treats this as an edit command. 


‡ Not available in all v2 editors due to memory constraints.

* But no modifying commands may intervene between the delete or yank and the put, nor may lines be
moved between files without using a named buffer.

† Ex will also issue a diagnostic if there are more files in the argument list.


Address ‘0’ is legal for this command and causes the file to be read at the beginning of
the buffer. Statistics are given as for the edit command when the read successfully ter-
minates. After a read the current line is the last line read.‡


(.) read !command 


Reads the output of the command command into the buffer after the specified line. 
This is not a variant form of the command, rather a read specifying a command rather 
than a filename; a blank or tab before the ! is mandatory. 


recover file 


Recovers file from the system save area. Used after an accidental hangup of the phone**
or a system crash** or preserve command. Except when you use preserve you will be 
notified by mail when a file is saved. 


rewind abbr: rew 
The argument list is rewound, and the first file in the list is edited. 


rew! 


Rewinds the argument list discarding any changes made to the current buffer. 


set parameter 


With no arguments, prints those options whose values have been changed from their
defaults; with parameter all it prints all of the option values. 


Giving an option name followed by a ‘?’ causes the current value of that option to be 
printed. The ‘?’ is unnecessary unless the option is Boolean valued. Boolean options are 
given values either by the form ‘set option’ to turn them on or ‘set nooption’ to turn 
them off; string and numeric options are assigned via the form ‘set option=value’. 


More than one parameter may be given to set; they are interpreted left-to-right.
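
The parameter forms accepted by set can be summarised in a short model. The Python sketch below is illustrative only: apply_set and show are hypothetical names, the printed format is an assumption, and the no-argument form (printing only changed options) is not modelled.

```python
def show(name, options):
    """Format one option the way this sketch prints it (an assumed format)."""
    value = options[name]
    if isinstance(value, bool):
        return name if value else "no" + name
    return "%s=%s" % (name, value)


def apply_set(arguments, options):
    """Apply the parameters of one set command left to right.

    options maps option names to bool, int or str values and is updated in
    place; the return value is the list of lines that would be printed.
    """
    printed = []
    for arg in arguments.split():
        if arg == "all":
            printed.extend(show(name, options) for name in sorted(options))
        elif arg.endswith("?"):
            printed.append(show(arg[:-1], options))
        elif "=" in arg:
            name, value = arg.split("=", 1)
            options[name] = int(value) if value.isdigit() else value
        elif arg in options and isinstance(options[arg], bool):
            options[arg] = True                    # 'set option' turns it on
        elif arg.startswith("no") and isinstance(options.get(arg[2:]), bool):
            options[arg[2:]] = False               # 'set nooption' turns it off
        elif arg in options:
            printed.append(show(arg, options))     # non-Boolean: '?' is optional
        else:
            raise ValueError("unknown option: " + arg)
    return printed
```

In this model the parameters "number window=10 autoindent? window" turn number on, set window to 10, and print the values of autoindent and window, in that order.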


shell abbr: sh 
A new shell is created. When it terminates, editing resumes. 


source file abbr: so 
Reads and executes commands from the specified file. Source commands may be nested. 


(.,.) substitute /pat/repl/ options count flags abbr: s 


On each specified line, the first instance of pattern pat is replaced by replacement pat-
tern repl. If the global indicator option character ‘g’ appears, then all instances are sub-
stituted; if the confirm indication character ‘c’ appears, then before each substitution the
line to be substituted is typed with the string to be substituted marked with ‘^’ charac-
ters. By typing a ‘y’ one can cause the substitution to be performed; any other input
causes no change to take place. After a substitute the current line is the last line substi-
tuted.


Lines may be split by substituting new-line characters into them. The newline in repl 
must be escaped by preceding it with a ‘\’. Other metacharacters available in pat and
repl are described below. 


‡ Within open and visual the current line is set to the first line read rather than the last.
** The system saves a copy of the file you were editing only if you have made changes to the file.


stop


Suspends the editor, returning control to the top level shell. If autowrite is set and 
there are unsaved changes, a write is done first unless the form stop! is used. This
command is only available where supported by the teletype driver and operating sys-
tem.


(.,.) substitute options count flags abbr: s 


If pat and repl are omitted, then the last substitution is repeated. This is a synonym 
for the & command. 


(.,.) t addr flags 
The t command is a synonym for copy.


ta tag 


The focus of editing switches to the location of tag, switching to a different line in the 
current file where it is defined, or if necessary to another file.†


The tags file is normally created by a program such as ctags, and consists of a number of 
lines with three fields separated by blanks or tabs. The first field gives the name of the 
tag, the second the name of the file where the tag resides, and the third gives an address- 
ing form which can be used by the editor to find the tag; this field is usually a contextual 
scan using ‘/pat/’ to be immune to minor changes in the file. Such scans are always per- 
formed as if nomagic was set. 


The tag names in the tags file must be sorted alphabetically.‡
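
The three-field layout and the sorting requirement suggest a simple reader and lookup. The Python sketch below is an illustration, not the editor's code: parse_tags and find_tag are hypothetical names, and the sample entries in the usage note are invented. Only the first two separators are split on, so an address such as a ‘/pat/’ scan may itself contain blanks.

```python
import bisect


def parse_tags(text):
    """Split each tags line into (name, file, address)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, filename, address = line.split(None, 2)
        entries.append((name, filename, address.rstrip()))
    return entries


def find_tag(entries, name):
    """Binary search by tag name; relies on the entries being sorted."""
    names = [entry[0] for entry in entries]
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return entries[i]
    return None
```

For instance, given the invented lines "main main.c /^main(argc, argv)$/" and "parse parse.c /^parse(line)$/", looking up parse returns the file parse.c together with the scan that locates the definition.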


unabbreviate word abbr: una 
Delete word from the list of abbreviations. 


undo abbr: u 


Reverses the changes made in the buffer by the last buffer editing command. Note that 
global commands are considered a single command for the purpose of undo (as are open 
and visual.) Also, the commands write and edit which interact with the file system can- 
not be undone. Undo is its own inverse. 


Undo always marks the previous value of the current line ‘.’ as ‘''’. After an undo the
current line is the first line restored or the line before the first line deleted if no lines
were restored. For commands with more global effect such as global and visual the
current line regains its pre-command value after an undo.


unmap lhs 


The macro expansion associated by map for lhs is removed. 


(1,$) v /pat/ cmds 


A synonym for the global command variant g!, running the specified cmds on each line 
which does not match pat. 


† If you have modified the current file before giving a tag command, you must write it out; giving another
tag command, specifying no tag will reuse the previous tag.
‡ Not available in all v2 editors due to memory constraints.


version abbr: ve
Prints the current version number of the editor as well as the date the editor was last 
changed. 

(.) visual type count flags abbr: vi


Enters visual mode at the specified line. Type is optional and may be ‘-’, ‘^’ or ‘.’ as in
the z command to specify the placement of the specified line on the screen. By default, 
if type is omitted, the specified line is placed as the first on the screen. A count 
specifies an initial window size; the default is the value of the option window. See the 
document An Introduction to Display Editing with Vi for more details. To exit this 
mode, type Q. 


visual file 
visual +n file 


From visual mode, this command is the same as edit. 


(1,$) write file abbr: w 


Writes changes made back to file, printing the number of lines and characters written. 
Normally file is omitted and the text goes back where it came from. If a file is specified, 
then text will be written to that file.* If the file does not exist it is created. The current 
file name is changed only if there is no current file name; the current line is never 
changed. 


If an error occurs while writing the current and edited file, the editor considers that 
there has been “No write since last change” even if the buffer had not previously been 
modified. 


(1,$) write>> file abbr: w>> 
Writes the buffer contents at the end of an existing file. 


w! name 


Overrides the checking of the normal write command, and will write to any file which 
the system permits. 


(1,$)w !command 


Writes the specified lines into command. Note the difference between w! which over- 
rides checks and w ! which writes to a command. 


wq name 


Like a write and then a quit command. 


wq! name 
The variant overrides checking on the sensibility of the write command, as w! does. 


xit name 


If any changes have been made and not written, writes the buffer out. Then, in any 
case, quits. 


* The editor writes to a file only if it is the current file and is edited, if the file does not exist, or if the file 
is actually a teletype, /dev/tty, /dev/null. Otherwise, you must give the variant form w! to force the write. 
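
The rule in this footnote can be restated as a small Python predicate. This is a sketch only, not the editor's code: write_allowed and its parameters are hypothetical names, "edited" stands for the editor's notion that the buffer corresponds to the current file, and the test for "actually a teletype" is simplified to the two device names the footnote lists.

```python
import os

# Simplification: only the two names given in the footnote are treated
# as terminal-like targets.
SPECIAL_FILES = ("/dev/tty", "/dev/null")

def write_allowed(target, current_file, edited, force=False,
                  exists=os.path.exists):
    """Return True if `write target` would proceed without complaint.

    force=True models the w! variant, which overrides the checks.
    """
    if force:
        return True
    if target == current_file and edited:
        return True                      # the current, edited file
    if not exists(target):
        return True                      # a new file is simply created
    return target in SPECIAL_FILES
```

The exists parameter is there only so the rule can be exercised without touching the file system.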


(.,.) yank buffer count abbr: ya


Places the specified lines in the named buffer, for later retrieval via put. If no buffer 
name is specified, the lines go to a more volatile place; see the put command description. 


(.+1) z count


Print the next count lines, default window. 


(.) z type count 


Prints a window of text with the specified line at the top. If type is ‘-’ the line is placed
at the bottom; a ‘.’ causes the line to be placed in the center.* A count gives the number 
of lines to be displayed rather than double the number specified by the scroll option. 
On a CRT the screen is cleared before display begins unless a count which is less than the 
screen size is given. The current line is left at the last line printed. 


! command 


The remainder of the line after the ‘!’ character is sent to a shell to be executed. Within 
the text of command the characters ‘%’ and ‘#’ are expanded as in filenames and the 
character ‘!’ is replaced with the text of the previous command. Thus, in particular, ‘!!’ 
repeats the last such shell escape. If any such expansion is performed, the expanded line 
will be echoed. The current line is unchanged by this command. 


If there has been “[No write]” of the buffer contents since the last change to the editing
buffer, then a diagnostic will be printed before the command is executed as a warning. 
A single ‘!’ is printed when the command completes. 
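
The expansion of ‘%’, ‘#’ and ‘!’ in a shell escape can be sketched in Python as follows. The function name and its parameters are hypothetical, and any quoting mechanism for these characters is deliberately left out, so this is an illustration of the rule rather than the editor's implementation.

```python
def expand_shell_escape(command, current_file, alternate_file, previous):
    """Expand %, # and ! in the text that follows the leading '!'.

    Returns (expanded_text, changed). When changed is true the editor
    would echo the expanded line before running it.
    """
    out = []
    for ch in command:
        if ch == "%":
            out.append(current_file)      # current file name
        elif ch == "#":
            out.append(alternate_file)    # alternate file name
        elif ch == "!":
            out.append(previous)          # text of the previous command
        else:
            out.append(ch)
    expanded = "".join(out)
    return expanded, expanded != command
```

With this model the escape ‘!!’ corresponds to a command text of "!" alone, which expands to the previous command.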


( addr , addr ) ! command


Takes the specified address range and supplies it as standard input to command; the
resulting output then replaces the input lines.


($) =


Prints the line number of the addressed line. The current line is unchanged.


(.,.) > count flags
(.,.) < count flags

Perform intelligent shifting on the specified lines; < shifts left and > shifts right. The
quantity of shift is determined by the shiftwidth option and the repetition of the 
specification character. Only white space (blanks and tabs) is shifted; no non-white 
characters are discarded in a left-shift. The current line becomes the last line which 
changed due to the shifting. 
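
As an illustration of the shifting rule, the following Python sketch adjusts only the leading white space of a line. The name shift_line is hypothetical, the count parameter stands for repetition of the ‘<’ or ‘>’ character, and the sketch always rebuilds the indent from blanks, which is an assumption and not necessarily what the editor emits.

```python
def shift_line(line, shiftwidth=8, count=1, left=False, tabstop=8):
    """Shift the leading white space of line by count * shiftwidth columns."""
    body = line.lstrip(" \t")
    white = line[: len(line) - len(body)]
    indent = len(white.expandtabs(tabstop))   # width of the existing indent
    delta = shiftwidth * count
    # A left shift never removes non-white characters: it stops at column 0.
    indent = max(0, indent - delta) if left else indent + delta
    return " " * indent + body
```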

^D
An end-of-file from a terminal input scrolls through the file. The scroll option specifies 
the size of the scroll, normally a half screen of text. 

(.+1,.+1)

(.+1,.+1) |


An address alone causes the addressed lines to be printed. A blank line prints the next 
line in the file. 


* Forms ‘z=’ and ‘z^’ also exist; ‘z=’ places the current line in the center, surrounds it with lines of ‘-’
characters and leaves the current line at this line. The form ‘z^’ prints the window before ‘z-’ would. The
characters ‘+’, ‘^’ and ‘-’ may be repeated for cumulative effect. On some v2 editors, no type may be given.




(.,.) & options count flags


Repeats the previous substitute command.


(.,.) ~ options count flags


Replaces the previous regular expression with the previous replacement pattern from a 
substitution. 


8. Regular expressions and substitute replacement patterns 


8.1. Regular expressions 


A regular expression specifies a set of strings of characters. A member of this set of 
strings is said to be matched by the regular expression. Ex remembers two previous regular 
expressions: the previous regular expression used in a substitute command and the previous 
regular expression used elsewhere (referred to as the previous scanning regular expression.) 
The previous regular expression can always be referred to by a null re, e.g. ‘//’ or ‘??’. 


8.2. Magic and nomagic 


The regular expressions allowed by ex are constructed in one of two ways depending on
the setting of the magic option. The ex and vi default setting of magic gives quick access to
a powerful set of regular expression metacharacters. The disadvantage of magic is that the
user must remember that these metacharacters are magic and precede them with the charac-
ter ‘\’ to use them as “ordinary” characters. With nomagic, the default for edit, regular
expressions are much simpler, there being only two metacharacters. The power of the other
metacharacters is still available by preceding the (now) ordinary character with a ‘\’. Note
that ‘\’ is thus always a metacharacter.


The remainder of the discussion of regular expressions assumes that the setting of
this option is magic.†


8.3. Basic regular expression summary 
The following basic constructs are used to construct magic mode regular expressions. 


char An ordinary character matches itself. The characters ‘^’ at the beginning of a
line, ‘$’ at the end of line, ‘*’ as any character other than the first, ‘.’, ‘\’, ‘[’,
and ‘~’ are not ordinary characters and must be escaped (preceded) by ‘\’ to
be treated as such.


^ At the beginning of a pattern forces the match to succeed only at the begin-
ning of a line.

$ At the end of a regular expression forces the match to succeed only at the end
of the line.


. Matches any single character except the new-line character.


\< Forces the match to occur only at the beginning of a “variable” or “word”;
that is, either at the beginning of a line, or just before a letter, digit, or under-
line and after a character not one of these.


\> Similar to ‘\<’, but matching the end of a “variable” or “word”, i.e. either the
end of the line or before a character which is neither a letter, nor a digit, nor
the underline character.


† To discern what is true with nomagic it suffices to remember that the only special characters in this case
will be ‘^’ at the beginning of a regular expression, ‘$’ at the end of a regular expression, and ‘\’. With
nomagic the characters ‘~’ and ‘&’ also lose their special meanings related to the replacement pattern of a
substitute.


[string] Matches any (single) character in the class defined by string. Most characters
in string define themselves. A pair of characters separated by ‘-’ in string
defines the set of characters collating between the specified lower and upper
bounds, thus ‘[a-z]’ as a regular expression matches any (single) lower-case
letter. If the first character of string is an ‘^’ then the construct matches
those characters which it otherwise would not; thus ‘[^a-z]’ matches anything
but a lower-case letter (and of course a newline). To place any of the charac-
ters ‘^’, ‘[’, or ‘-’ in string you must escape them with a preceding ‘\’.


8.4. Combining regular expression primitives 


The concatenation of two regular expressions matches the leftmost and then longest 
string which can be divided with the first piece matching the first regular expression and the 
second piece matching the second. Any of the (single character matching) regular expressions 
mentioned above may be followed by the character ‘*’ to form a regular expression which 
matches any number of adjacent occurrences (including 0) of characters matched by the regu- 
lar expression it follows. 
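

The basic constructs and the ‘*’ closure can be tried out with Python's re module, which is a different regular expression engine but treats ‘^’, ‘$’, ‘.’, character classes and ‘*’ in the same way for simple cases such as these. The word anchors and the ‘~’ construct of ex have no direct counterpart in re and are not shown. The sample line and the variable names are invented for the illustration.

```python
import re

LINE = "count = count + 1"

def spans(pattern, text=LINE):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

anchored_start = spans(r"^count")    # only the occurrence at the line start
anchored_end = spans(r"1$")          # only a match ending the line
any_char = spans(r"c.u")             # '.' stands for any single character
lower_runs = re.findall(r"[a-z][a-z]*", LINE)   # a class followed by '*'
non_lower = re.findall(r"[^a-z]", "a-b")        # a negated class
```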


The character ‘~’ may be used in a regular expression, and matches the text which
defined the replacement part of the last substitute command. A regular expression may be
enclosed between the sequences ‘\(’ and ‘\)’ with side effects in the substitute replacement
patterns.


8.5. Substitute replacement patterns 


The basic metacharacters for the replacement pattern are ‘&’ and ‘~’; these are given as
‘\&’ and ‘\~’ when nomagic is set. Each instance of ‘&’ is replaced by the characters which the
regular expression matched. The metacharacter ‘~’ stands, in the replacement pattern, for the
defining text of the previous replacement pattern.


Other metasequences possible in the replacement pattern are always introduced by the
escaping character ‘\’. The sequence ‘\n’ is replaced by the text matched by the n-th regular
subexpression enclosed between ‘\(’ and ‘\)’.† The sequences ‘\u’ and ‘\l’ cause the immediately
following character in the replacement to be converted to upper- or lower-case respectively if
this character is a letter. The sequences ‘\U’ and ‘\L’ turn such conversion on, either until ‘\E’
or ‘\e’ is encountered, or until the end of the replacement pattern.
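

The replacement metasequences can be modeled in Python as a function that expands a replacement pattern for one match. This is a sketch of the magic-mode rules as described here, not the editor's code: expand_replacement is a hypothetical name, the match object comes from Python's re module (whose groups are written with plain parentheses), and the text standing for ‘~’ is inserted as given without further expansion.

```python
import re

def expand_replacement(repl, match, previous=""):
    """Expand an ex-style (magic mode) replacement for one regex match."""
    out = []
    one_shot = None   # pending \u or \l: applies to the next character only
    running = None    # active \U or \L: applies until \E, \e or the end

    def emit(text):
        nonlocal one_shot
        for ch in text:
            if one_shot is not None:
                ch = one_shot(ch)
                one_shot = None
            elif running is not None:
                ch = running(ch)
            out.append(ch)

    i = 0
    while i < len(repl):
        ch = repl[i]
        if ch == "&":
            emit(match.group(0))               # the whole matched text
        elif ch == "~":
            emit(previous)                     # previous replacement text
        elif ch == "\\" and i + 1 < len(repl):
            i += 1
            nxt = repl[i]
            if nxt.isdigit():
                emit(match.group(int(nxt)) or "")   # n-th subexpression
            elif nxt == "u":
                one_shot = str.upper
            elif nxt == "l":
                one_shot = str.lower
            elif nxt == "U":
                running = str.upper
            elif nxt == "L":
                running = str.lower
            elif nxt in "Ee":
                running = None
            else:
                emit(nxt)                      # escaped ordinary character
        else:
            emit(ch)
        i += 1
    return "".join(out)
```

For a match of (\w+) (\w+) against "hello world", the replacement \u\1 \U\2\E! expands to "Hello WORLD!".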


9. Option descriptions 


autoindent, ai default: noai 


Can be used to ease the preparation of structured program text. At the beginning of 
each append, change or insert command or when a new line is opened or created by an 
append, change, insert, or substitute operation within open or visual mode, ex looks at 
the line being appended after, the first line changed or the line inserted before and cal- 
culates the amount of white space at the start of the line. It then aligns the cursor at 
the level of indentation so determined. 


If the user then types lines of text in, they will continue to be justified at the displayed 
indenting level. If more white space is typed at the beginning of a line, the following 
line will start aligned with the first non-white character of the previous line. To back 
the cursor up to the preceding tab stop one can hit ^D. The tab stops going backwards
are defined at multiples of the shiftwidth option. You cannot backspace over the
indent, except by sending an end-of-file with a ^D.


† When nested, parenthesized subexpressions are present, n is determined by counting occurrences of ‘\(’
starting from the left.


Specially processed in this mode is a line with no characters added to it, which turns
into a completely blank line (the white space provided for the autoindent is discarded.)
Also specially processed in this mode are lines beginning with an ‘^’ and immediately fol-
lowed by a ^D. This causes the input to be repositioned at the beginning of the line, but
retaining the previous indent for the next line. Similarly, a ‘0’ followed by a ^D reposi-
tions at the beginning but without retaining the previous indent.


Autoindent doesn’t happen in global commands or when the input is not a terminal. 
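

Two pieces of the autoindent arithmetic can be written down as a Python sketch: the width of the white space at the start of a line, and the column reached by one ^D. The function names are hypothetical and the tab stop width of 8 is an assumption taken from the default of the tabstop option.

```python
def indent_width(line, tabstop=8):
    """Width in columns of the white space at the start of line."""
    body = line.lstrip(" \t")
    white = line[: len(line) - len(body)]
    return len(white.expandtabs(tabstop))

def back_tab(column, shiftwidth=8):
    """Column reached by one ^D: the preceding multiple of shiftwidth."""
    if column <= 0:
        return 0
    return ((column - 1) // shiftwidth) * shiftwidth
```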


autoprint, ap default: ap 


Causes the current line to be printed after each delete, copy, join, move, substitute, t, 
undo or shift command. This has the same effect as supplying a trailing ‘p’ to each such 
command. Autoprint is suppressed in globals, and only applies to the last of many com- 
mands on a line. 


autowrite, aw default: noaw 


Causes the contents of the buffer to be written to the current file if you have modified it
and give a next, rewind, stop, tag, or ! command, or a ^^ (switch files) or ^] (tag goto)
command in visual. Note that the edit and ex commands do not autowrite. In each
case, there is an equivalent way of switching when autowrite is set to avoid the
autowrite (edit for next, rewind! for rewind, stop! for stop, tag! for tag, shell for !,
and :e # and a :ta! command from within visual).


beautify, bf default: nobeautify 


Causes all control characters except tab, newline and form-feed to be discarded from the 
input. A complaint is registered the first time a backspace character is discarded. Beau- 
tify does not apply to command input. 


directory, dir default: dir=/tmp 


Specifies the directory in which ex places its buffer file. If this directory is not writable,
then the editor will exit abruptly when it fails to be able to create its buffer there. 


edcompatible default: noedcompatible 


Causes the presence or absence of g and c suffixes on substitute commands to be remem-
bered, and to be toggled by repeating the suffixes. The suffix r makes the substitution
be as in the ~ command, instead of like &.‡‡


errorbells, eb default: noeb 


Error messages are preceded by a bell.* If possible the editor always places the error 
message in a standout mode of the terminal (such as inverse video) instead of ringing the 
bell. 


hardtabs, ht default: ht=8 


Gives the boundaries on which terminal hardware tabs are set (or on which the system 
expands tabs). 


ignorecase, ic default: noic 


All upper case characters in the text are mapped to lower case in regular expression 
matching. In addition, all upper case characters in regular expressions are mapped to 
lower case except in character class specifications. 
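

The second rule, that upper case in a regular expression is mapped to lower case except inside character classes, can be sketched in Python. The name fold_pattern is hypothetical, and copying backslash pairs unchanged is an assumption made for the sketch.

```python
def fold_pattern(pattern):
    """Lower-case a pattern except inside [...] character classes."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            out.append(pattern[i:i + 2])      # escaped pair copied as is
            i += 2
        elif ch == "[":
            j = i + 1
            while j < n and pattern[j] != "]":
                j += 2 if pattern[j] == "\\" else 1
            out.append(pattern[i:j + 1])      # whole class copied unchanged
            i = j + 1
        else:
            out.append(ch.lower())
            i += 1
    return "".join(out)
```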


‡‡ Version 3 only.
* Bell ringing in open and visual on errors is not suppressed by setting noeb.


lisp default: nolisp


Autoindent indents appropriately for lisp code, and the ( ) { } [[ and ]] commands in
open and visual are modified to have meaning for lisp.


list default: nolist 


All printed lines will be displayed (more) unambiguously, showing tabs and end-of-lines 
as in the list command. 


magic default: magic for ex and vi†


If nomagic is set, the number of regular expression metacharacters is greatly reduced,
with only ‘^’ and ‘$’ having special effects. In addition the metacharacters ‘~’ and ‘&’ of
the replacement pattern are treated as normal characters. All the normal metacharac-
ters may be made magic when nomagic is set by preceding them with a ‘\’.


mesg default: mesg 


Causes write permission to be turned off to the terminal while you are in visual mode, if
nomesg is set.‡‡


number, nu default: nonumber 


Causes all output lines to be printed with their line numbers. In addition each input 
line will be prompted for by supplying the line number it will have. 


open default: open 


If noopen, the commands open and visual are not permitted. This is set for edit to 
prevent confusion resulting from accidental entry to open or visual mode. 


optimize, opt default: optimize 


Throughput of text is expedited by setting the terminal to not do automatic carriage 
returns when printing more than one (logical) line of output, greatly speeding output on 
terminals without addressable cursors when text with leading white space is printed. 


paragraphs, para default: para=IPLPPPQPP LIbp 


Specifies the paragraphs for the { and } operations in open and visual. The pairs of 
characters in the option’s value are the names of the macros which start paragraphs. 


prompt default: prompt 
Command mode input is prompted for with a ‘:’. 


redraw default: noredraw 


The editor simulates (using great amounts of output), an intelligent terminal on a dumb 
terminal (e.g. during insertions in visual the characters to the right of the cursor posi- 
tion are refreshed as each input character is typed.) Useful only at very high speed. 


remap default: remap 


If on, macros are repeatedly tried until they are unchanged.‡‡ For example, if o is
mapped to O, and O is mapped to I, then if remap is set, o will map to I, but if 
noremap is set, it will map to O. 
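

The difference between remap and noremap in the example above can be reproduced with a small Python model of single-key macros. The name expand_macro is hypothetical, and the step limit is an assumption added only to keep the sketch from looping on circular definitions.

```python
def expand_macro(key, macros, remap=True, limit=100):
    """Apply single-key macros as the remap option is described."""
    if key not in macros:
        return key
    result = macros[key]
    if not remap:
        return result                 # noremap: map once and stop
    steps = 0
    # remap: keep mapping the result until it no longer changes.
    while result in macros and macros[result] != result and steps < limit:
        result = macros[result]
        steps += 1
    return result

MACROS = {"o": "O", "O": "I"}
```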


† Nomagic for edit.
‡‡ Version 3 only.


report default: report=5†


Specifies a threshold for feedback from commands. Any command which modifies more 
than the specified number of lines will provide feedback as to the scope of its changes. 
For commands such as global, open, undo, and visual which have potentially more far 
reaching scope, the net change in the number of lines in the buffer is presented at the 
end of the command, subject to this same threshold. Thus notification is suppressed 
during a global command on the individual commands performed. 


scroll default: scroll=½ window


Determines the number of logical lines scrolled when an end-of-file is received from a 
terminal input in command mode, and the number of lines printed by a command mode 
z command (double the value of scroll). 


sections default: sections=SHNHH HU 


Specifies the section macros for the [[ and ]] operations in open and visual. The pairs of 
characters in the option’s value are the names of the macros which start sections.


shell, sh default: sh=/bin/sh 


Gives the path name of the shell forked for the shell escape command ‘!’, and by the 
shell command. The default is taken from SHELL in the environment, if present. 


shiftwidth, sw default: sw=8 


Gives the width of a software tab stop, used in reverse tabbing with ^D when using autoin-
dent to append text, and by the shift commands.


showmatch, sm default: nosm 


In open and visual mode, when a ) or } is typed, move the cursor to the matching ( or { 
for one second if this matching character is on the screen. Extremely useful with lisp. 


slowopen, slow terminal dependent 


Affects the display algorithm used in visual mode, holding off display updating during 
input of new text to improve throughput when the terminal in use is both slow and unin- 
telligent. See An Introduction to Display Editing with Vi for more details. 


tabstop, ts default: ts=8 
The editor expands tabs in the input file to be on tabstop boundaries for the purposes of 
display. 

taglength, tl default: tl=0 


Tags are not significant beyond this many characters. A value of zero (the default) 
means that all characters are significant. 


tags default: tags=tags /usr/lib/tags 


A path of files to be used as tag files for the tag command.‡‡ A requested tag is
searched for in the specified files, sequentially. By default (even in version 2) files called 
tags are searched for in the current directory and in /usr/lib (a master file for the entire 
system.) 


† 2 for edit.
‡‡ Version 3 only.


term from environment TERM
The terminal type of the output device. 


terse default: noterse


Shorter error diagnostics are produced for the experienced user. 


warn default: warn 
Warn if there has been ‘[No write since last change]’ before a ‘!’ command escape. 


window default: window=speed dependent 


The number of lines in a text window in the visual command. The default is 8 at slow 
speeds (600 baud or less), 16 at medium speed (1200 baud), and the full screen (minus 
one line) at higher speeds. 


w300, w1200, w9600 


These are not true options but set window only if the speed is slow (300), medium 
(1200), or high (9600), respectively. They are suitable for an EXINIT and make it easy 
to change the 8/16/full screen rule. 
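

The default window rule and the effect of w300, w1200 and w9600 can be summarized in a Python sketch. The function name is hypothetical. The treatment of speeds other than 300, 1200 and 9600 follows the statement in the version 3.3 update notes of this manual that any rate below 1200 behaves like 300 and any rate above 1200 like 9600.

```python
def default_window(baud, screen_lines=24, w300=None, w1200=None, w9600=None):
    """Window size chosen for a terminal speed, honoring w300/w1200/w9600."""
    if baud < 1200:                       # slow: 8 lines unless w300 is set
        return w300 if w300 is not None else 8
    if baud == 1200:                      # medium: 16 lines unless w1200 is set
        return w1200 if w1200 is not None else 16
    # fast: the full screen minus one line unless w9600 is set
    return w9600 if w9600 is not None else screen_lines - 1
```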


wrapscan, ws default: ws 
Searches using the regular expressions in addressing will wrap around past the end of 
the file. 

wrapmargin, wm default: wm=0 


Defines a margin for automatic wrapover of text during input in open and visual modes. 
See An Introduction to Display Editing with Vi for details.


writeany, wa default: nowa 


Inhibit the checks normally made before write commands, allowing a write to any file 
which the system protection mechanism will allow. 


10. Limitations 


Editor limits that the user is likely to encounter are as follows: 1024 characters per line, 
256 characters per global command list, 128 characters per file name, 128 characters in the 
previous inserted and deleted text in open or visual, 100 characters in a shell escape com- 
mand, 63 characters in a string valued option, and 30 characters in a tag name, and a limit of 
250000 lines in the file is silently enforced. 


The visual implementation limits the number of macros defined with map to 32, and the 
total number of characters in macros to be less than 512. 


Acknowledgments. Chuck Haley contributed greatly to the early development of ex. Bruce 
Englar encouraged the redesign which led to ex version 1. Bill Joy wrote versions 1 and 2.0 
through 2.7, and created the framework that users see in the present editor. Mark Horton 
added macros and other features and made the editor work on a large number of terminals 
and Unix systems. 
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Ex changes — Version 3.1 to 3.5 


This update describes the new features and changes which have been made in converting
from version 3.1 to 3.5 of ex. Each change is marked with the first version where it appeared.


Update to Ex Reference Manual 


Command line options 


3.4 A new command called view has been created. View is just like vi but it sets readonly.


3.4 The encryption code from the v7 editor is now part of ex. You can invoke ex with the
-x option and it will ask for a key, as ed. The ed x command (to enter encryption
mode from within the editor) is not available. This feature may not be available in all
instances of ex due to memory limitations.


Commands 


3.4 Provisions to handle the new process stopping features of the Berkeley TTY driver have
been added. A new command, stop, takes you out of the editor cleanly and efficiently,
returning you to the shell. Resuming the editor puts you back in command or visual
mode, as appropriate. If autowrite is set and there are outstanding changes, a write is
done first unless you say “stop!”.


3.4 A
:vi <file>
command from visual mode is now treated the same as a
:edit <file> or :ex <file>
command. The meaning of the vi command from ex command mode is not affected.


3.3 A new command mode command xit (abbreviated x) has been added. This is the same
as wq but will not bother to write if there have been no changes to the file.


Options 


3.4 A read only mode now lets you guarantee you won’t clobber your file by accident. You
can set the on/off option readonly (ro), and writes will fail unless you use an ! after the
write. Commands such as x, ZZ, the autowrite option, and in general anything that
writes is affected. This option is turned on if you invoke ex with the -R flag.


3.4 The wrapmargin option is now usable. The way it works has been completely
revamped. Now if you go past the margin (even in the middle of a word) the entire word
is erased and rewritten on the next line. This changes the semantics of the number
given to wrapmargin. 0 still means off. Any other number is still a distance from the
right edge of the screen, but this location is now the right edge of the area where wraps
can take place, instead of the left edge. Wrapmargin now behaves much like
fill/nojustify mode in nroff.


3.3 The options w300, w1200, and w9600 can be set. They are synonyms for window, but
only apply at 300, 1200, or 9600 baud, respectively. Thus you can specify you want a 12
line window at 300 baud and a 23 line window at 1200 baud in your EXINIT with


set w300=12 w1200=23


3.3 The new option timeout (default on) causes macros to time out after one second. Turn
it off and they will wait forever. This is useful if you want multi character macros, but if
your terminal sends escape sequences for arrow keys, it will be necessary to hit escape
twice to get a beep.


3.3 The new option remap (default on) causes the editor to attempt to map the result of a
macro mapping again until the mapping fails. This makes it possible, say, to map q to #
and #1 to something else and get q1 mapped to something else. Turning it off makes it
possible to map ^L to l and map ^R to ^L without having ^R map to l.


3.3 The new (string) valued option tags allows you to specify a list of tag files, similar to the
“path” variable of csh. The files are separated by spaces (which are entered preceded by
a backslash) and are searched left to right. The default value is “tags /usr/lib/tags”,
which has the same effect as before. It is recommended that “tags” always be the first
entry. On Ernie CoVax, /usr/lib/tags contains entries for the system defined library pro-
cedures from section 3 of the manual.


Environment enquiries 


3.4 The editor now adopts the convention that a null string in the environment is the same
as not being set. This applies to TERM, TERMCAP, and EXINIT.
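

The convention can be stated in a few lines of Python. The helper name env_setting is hypothetical, and the optional environ argument exists only so that the rule can be tried on a plain dictionary instead of the real environment.

```python
import os

def env_setting(name, environ=None):
    """Return the value of name, or None if it is unset or a null string."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    return value if value else None
```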


Vi Tutorial Update 


Deleted features 


3.3 The “q” command from visual no longer works at all. You must use “Q” to get to ex
command mode. The “q” command was deleted because of user complaints about hit-
ting it by accident too often.


3.0 The provisions for changing the window size with a numeric prefix argument to certain
visual commands have been deleted. The correct way to change the window size is to
use the z command, for example z5<cr> to change the window to 5 lines.


3.3 The option “mapinput” is dead. It has been replaced by a much more powerful mechan-
ism: “:map!”.


Change in default option settings 


3.3 The default window sizes have been changed. At 300 baud the window is now 8 lines (it
was 1/2 the screen size). At 1200 baud the window is now 16 lines (it was 2/3 the screen
size, which was usually also 16 for a typical 24 line CRT). At 9600 baud the window is
still the full screen size. Any baud rate less than 1200 behaves like 300, any over 1200
like 9600. This change makes vi more usable on a large screen at slow speeds.


Vi commands 


3.3 The command “ZZ” from vi is the same as “:x<cr>”. This is the recommended way to
leave the editor. Z must be typed twice to avoid hitting it accidentally.


3.4 The command ^Z is the same as “:stop<cr>”. Note that if you have an arrow key that
sends ^Z the stop function will take priority over the arrow function. If you have your
“susp” character set to something besides ^Z, that key will be honored as well.


3.3 It is now possible from visual to string several search expressions together separated by
semicolons, the same as in command mode. For example, you can say


/foo/;/bar
from visual and it will move to the first “bar” after the next “foo”. This also works
within one line.


3.3 ^R is now the same as ^L on terminals where the right arrow key sends ^L. (This includes
the Televideo 912/920 and the ADM 31 terminals.)


3.4 The visual page motion commands ^F and ^B now treat any preceding counts as number
of pages to move, instead of changes to the window size. That is, 2^F moves forward 2
pages.
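

The chained search form /foo/;/bar described above can be modeled in Python as a sequence of searches, each starting just past the position the previous one reached. This is only a sketch: chained_search is a hypothetical name, Python's re module stands in for the editor's regular expressions, and wrapping around the end of the file is not modeled.

```python
import re

def chained_search(text, patterns, cursor=0):
    """Return the offset reached by searching for each pattern in turn.

    Every search looks for the first match that begins after the current
    position. None is returned if any search fails.
    """
    pos = cursor
    for pattern in patterns:
        m = re.compile(pattern).search(text, pos + 1)
        if m is None:
            return None
        pos = m.start()
    return pos

TEXT = "bar one foo two bar three bar"
```

With the cursor at offset 0, chained_search(TEXT, ["foo", "bar"]) skips the first “bar” and lands on the one that follows “foo”, all within a single line.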
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Macros 


3.3 The “mapinput” mechanism of version 3.1 has been replaced by a more powerful
mechanism. An “!” can follow the word “map” in the map command. Map!ed macros
only apply during input mode, while map’ed macros only apply during command mode.
Using “map” or “map!” by itself produces a listing of macros in the corresponding
mode.


3.4 A word abbreviation mode is now available. You can define abbreviations with the
abbreviate command


:abbr foo find outer otter


which maps “foo” to “find outer otter”. Abbreviations can be turned off with the unab-
breviate command. The syntax of these commands is identical to the map and unmap
commands, except that the ! forms do not exist. Abbreviations are considered when in
visual input mode only, and only affect whole words typed in, using the conservative
definition. (Thus “foobar” will not be mapped as it would using “map!”.) Abbreviate and
unabbreviate can be abbreviated to “ab” and “una”, respectively.
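

The whole-word behavior of abbreviations can be illustrated with a Python sketch. The function name is hypothetical, and taking a word to be a run of letters, digits and underscores is an assumption about what the conservative definition means.

```python
import re

def expand_abbreviations(text, abbreviations):
    """Replace whole words of text that are defined abbreviations."""
    def replace_word(m):
        word = m.group(0)
        return abbreviations.get(word, word)
    return re.sub(r"[A-Za-z0-9_]+", replace_word, text)

ABBREVIATIONS = {"foo": "find outer otter"}
```

Applied to "foo and foobar" this expands only the first word, whereas a plain substring replacement, which is closer to what a map! macro would do, would also alter “foobar”.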
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SED — A Non-Interactive Text Editor 
Lee E. McMahon 


Bell Laboratories 
Murray Hill, New Jersey 07974 


Introduction 
Sed is a non-interactive context editor designed to be especially useful in three cases: 


1) To edit files too large for comfortable interactive editing; 

2) To edit any size file when the sequence of editing commands is too complicated to 
be comfortably typed in interactive mode; 

3) To perform multiple ‘global’ editing functions efficiently in one pass through the 
input. 


Since only a few lines of the input reside in core at one time, and no temporary files are used, 
the effective size of file that can be edited is limited only by the requirement that the input 
and output fit simultaneously into available secondary storage. 


Complicated editing scripts can be created separately and given to sed as a command file. For 
complex edits, this saves considerable typing, and its attendant errors. Sed running from a 
command file is much more efficient than any interactive editor known to the author, even if 
that editor can be driven by a pre-written script. 


The principal losses of function compared to an interactive editor are the lack of relative address-
ing (because of the line-at-a-time operation), and the lack of immediate verification that a com-
mand has done what was intended.


Sed is a lineal descendant of the UNIX editor, ed. Because of the differences between 
interactive and non-interactive operation, considerable changes have been made between ed 
and sed; even confirmed users of ed will frequently be surprised (and probably chagrined), if 
they rashly use sed without reading Sections 2 and 3 of this document. The most striking 
family resemblance between the two editors is in the class of patterns (‘regular expressions’) 
they recognize; the code for matching patterns is copied almost verbatim from the code for ed, 
and the description of regular expressions in Section 2 is copied almost verbatim from the 
UNIX Programmer’s Manual[1]. (Both code and description were written by Dennis M. 
Ritchie.) 


1. Overall Operation 


Sed by default copies the standard input to the standard output, perhaps performing one or 
more editing commands on each line before writing it to the output. This behavior may be 
modified by flags on the command line; see Section 1.1 below. 


The general format of an editing command is: 
[address1,address2][function][arguments] 


One or both addresses may be omitted; the format of addresses is given in Section 2. Any 
number of blanks or tabs may separate the addresses from the function. The function must 
be present; the available commands are discussed in Section 3. The arguments may be
required or optional, according to which function is given; again, they are discussed in Section
3 under each individual function.


Tab characters and spaces at the beginning of lines are ignored. 


1.1. Command-line Flags 


Three flags are recognized on the command line: 
-n: tells sed not to copy all lines, but only those specified by p functions or p flags 
after s functions (see Section 3.3); 
-e: tells sed to take the next argument as an editing command; 
-f: tells sed to take the next argument as a file name; the file should contain editing 
commands, one to a line. 
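
The three flags can be tried on any system with a POSIX-style sed. The following is a minimal sketch rather than part of the original article: the three-line input and the temporary script file are invented for illustration.

```shell
# Illustrative input (not from the article).
input='alpha
beta
gamma'

# -n suppresses the default copy; only lines selected by p are written.
only_second=$(printf '%s\n' "$input" | sed -n -e '2p')

# Each -e supplies one editing command; they are applied in the order given.
two_cmds=$(printf '%s\n' "$input" | sed -e '1d' -e 's/a$/A/')

# -f names a file that holds the editing commands, one to a line.
script=$(mktemp)
printf '%s\n' '1d' 's/a$/A/' > "$script"
from_file=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$only_second" "$two_cmds" "$from_file"
```

The -e and -f forms give the same result here because they carry the same two commands.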


1.2. Order of Application of Editing Commands 


Before any editing is done (in fact, before any input file is even opened), all the editing com- 
mands are compiled into a form which will be moderately efficient during the execution phase 
(when the commands are actually applied to lines of the input file). The commands are com- 
piled in the order in which they are encountered; this is generally the order in which they will 
be attempted at execution time. The commands are applied one at a time; the input to each 
command is the output of all preceding commands. 
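
Because each command works on the output of the commands before it, the order of the commands matters. A small sketch, using an invented one-word input:

```shell
# The second command sees the result of the first, so "cat" ends up as "dog".
chained=$(echo 'cat' | sed -e 's/cat/cow/' -e 's/cow/dog/')

# In the opposite order, s/cow/dog/ runs first and finds nothing to change.
reversed=$(echo 'cat' | sed -e 's/cow/dog/' -e 's/cat/cow/')

echo "$chained $reversed"
```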


The default linear order of application of editing commands can be changed by the flow-of- 
control commands, t and b (see Section 3). Even when the order of application is changed by 
these commands, it is still true that the input line to any command is the output of any previ- 
ously applied command. 


1.3. Pattern-space 


The range of pattern matches is called the pattern space. Ordinarily, the pattern space is one 
line of the input text, but more than one line can be read into the pattern space by using the 
N command (Section 3.4). 


1.4. Examples 
Examples are scattered throughout the text. Except where otherwise noted, the examples all 
assume the following input text: 


In Xanadu did Kubla Khan 
A stately pleasure dome decree: 
Where Alph, the sacred river, ran 
Through caverns measureless to man 
Down to a sunless sea. 


(In no case is the output of the sed commands to be considered an improvement on 
Coleridge.) 


Example: 
The command 
2q 
will quit after copying the first two lines of the input. The output will be: 


In Xanadu did Kubla Khan 
A stately pleasure dome decree: 
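
To try this example from a shell, the sample text can be placed in a file and the command given as an argument to sed. This is a sketch that assumes a POSIX-style sed; the temporary file is an arbitrary choice.

```shell
# Recreate the sample text in a temporary file.
poem=$(mktemp)
cat > "$poem" <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# 2q: copy lines 1 and 2 to the output, then quit.
first_two=$(sed 2q "$poem")
echo "$first_two"
rm -f "$poem"
```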


2. ADDRESSES: Selecting lines for editing 


Lines in the input file(s) to which editing commands are to be applied can be selected by 
addresses. Addresses may be either line numbers or context addresses. 


The application of a group of commands can be controlled by one address (or address-pair) by 
grouping the commands with curly braces (‘{ }’) (Section 3.6). 


2.1. Line-number Addresses 


A line number is a decimal integer. As each line is read from the input, a line-number 
counter is incremented; a line-number address matches (selects) the input line which causes 
the internal counter to equal the address line-number. The counter runs cumulatively 
through multiple input files; it is not reset when a new input file is opened. 


As a special case, the character $ matches the last line of the last input file. 
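
The cumulative counter and the $ address can be observed with two small input files. This is a sketch with invented file names and contents, assuming a POSIX-style sed run without any option that treats files separately.

```shell
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a"
printf 'b1\nb2\n' > "$dir/b"

# The counter is not reset between files, so line 3 is the first line of b.
third=$(sed -n '3p' "$dir/a" "$dir/b")

# $ selects only the last line of the last input file.
last=$(sed -n '$p' "$dir/a" "$dir/b")

echo "$third $last"
rm -rf "$dir"
```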


2.2. Context Addresses 


A context address is a pattern (‘regular expression’) enclosed in slashes (‘/’). The regular 
expressions recognized by sed are constructed as follows: 


1) An ordinary character (not one of those discussed below) is a regular expression, 
and matches that character. 


2) A circumflex ‘^’ at the beginning of a regular expression matches the null character 
at the beginning of a line. 

3) A dollar-sign ‘$’ at the end of a regular expression matches the null character at the 
end of a line. 

4) The characters ‘\n’ match an imbedded newline character, but not the newline at 
the end of the pattern space. 

5) A period ‘.’ matches any character except the terminal newline of the pattern space. 

6) A regular expression followed by an asterisk ‘*’ matches any number (including 0) 
of adjacent occurrences of the regular expression it follows. 

7) A string of characters in square brackets ‘[ ]’ matches any character in the string, 
and no others. If, however, the first character of the string is circumflex ‘^’, 
the regular expression matches any character except the characters in the 
string and the terminal newline of the pattern space. 

8) A concatenation of regular expressions is a regular expression which matches the 
concatenation of strings matched by the components of the regular expression. 

9) A regular expression between the sequences ‘\(’ and ‘\)’ is identical in effect to the 
unadorned regular expression, but has side-effects which are described under 
the s command below and specification 10) immediately below. 

10) The expression ‘\d’ means the same string of characters matched by an expression 
enclosed in ‘\(’ and ‘\)’ earlier in the same pattern. Here d is a single digit; the 
string specified is that beginning with the dth occurrence of ‘\(’ counting from 
the left. For example, the expression ‘^\(.*\)\1’ matches a line beginning with 
two repeated occurrences of the same string. 

11) The null regular expression standing alone (e.g., ‘//’) is equivalent to the last regu- 
lar expression compiled. 


To use one of the special characters (^ $ . * [ ] \ /) as a literal (to match an occurrence of itself 
in the input), precede the special character by a backslash ‘\’. 

For a context address to ‘match’ the input requires that the whole pattern within the address 
match some portion of the pattern space. 


2.3. Number of Addresses 


The commands in the next section can have 0, 1, or 2 addresses. Under each command the 
maximum number of allowed addresses is given. For a command to have more addresses than 
the maximum allowed is considered an error. 


If a command has no addresses, it is applied to every line in the input. 


If a command has one address, it is applied to all lines which match that address. 


If a command has two addresses, it is applied to the first line which matches the first address, 
and to all subsequent lines until (and including) the first subsequent line which matches the 
second address. Then an attempt is made on subsequent lines to again match the first 
address, and the process is repeated. 


Two addresses are separated by a comma. 


Examples: 


/an/            matches lines 1, 3, 4 in our sample text 
/an.*an/        matches line 1 
/^an/           matches no lines 
/./             matches all lines 
/\./            matches line 5 
/r*an/          matches lines 1, 3, 4 (number = zero!) 
/\(an\).*\1/    matches line 1 
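
These context addresses can be checked against the sample text with the = function, which prints the number of each selected line. The helper matching_lines below is a hypothetical convenience, not part of sed, and the sketch assumes a POSIX-style sed.

```shell
poem=$(mktemp)
cat > "$poem" <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# Hypothetical helper: print the numbers of the lines selected by the
# address in $1, joined by single spaces.
matching_lines() {
    echo $(sed -n "$1=" "$poem")
}

an=$(matching_lines '/an/')
an_an=$(matching_lines '/an.*an/')
start_an=$(matching_lines '/^an/')
any=$(matching_lines '/./')
period=$(matching_lines '/\./')
r_star=$(matching_lines '/r*an/')
backref=$(matching_lines '/\(an\).*\1/')
rm -f "$poem"

echo "$an | $an_an | $start_an | $any | $period | $r_star | $backref"
```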


3. FUNCTIONS 


All functions are named by a single character. In the following summary, the maximum 
number of allowable addresses is given enclosed in parentheses, then the single character func- 
tion name, possible arguments enclosed in angles (< >), an expanded English translation of 
the single-character name, and finally a description of what each function does. The angles 
around the arguments are not part of the argument, and should not be typed in actual editing 
commands. 


3.1. Whole-line Oriented Functions 


(2)d -- delete lines 


The d function deletes from the file (does not write to the output) all those 
lines matched by its address(es). 


It also has the side effect that no further commands are attempted on the 
corpse of a deleted line; as soon as the d function is executed, a new line is 
read from the input, and the list of editing commands is re-started from the 
beginning on the new line. 


(2)n -- next line 


The n function reads the next line from the input, replacing the current line. 
The current line is written to the output if it should be. The list of editing 
commands is continued following the n command. 


(1)a\ 
<text> -- append lines 


The a function causes the argument <text> to be written to the output after 
the line matched by its address. The a command is inherently multi-line; a\ 
must appear at the end of a line, and <text> may contain any number of 
lines. To preserve the one-command-to-a-line fiction, the interior newlines 
must be hidden by a backslash character (‘\’) immediately preceding the 
newline. The <text> argument is terminated by the first unhidden newline (the 
first one not immediately preceded by backslash). 


Once an a function is successfully executed, <text> will be written to the 
output regardless of what later commands do to the line which triggered it. The 
triggering line may be deleted entirely; <text> will still be written to the 
output. 


The <text> is not scanned for address matches, and no editing commands are 
attempted on it. It does not cause any change in the line-number counter. 


(1)i\ 
<text> -- insert lines 


The i function behaves identically to the a function, except that <text> is 
written to the output before the matched line. All other comments about the 
a function apply to the i function as well. 


(2)c\ 
<text> -- change lines 


The c function deletes the lines selected by its address(es), and replaces them 
with the lines in <text>. Like a and i, c must be followed by a newline 
hidden by a backslash; and interior newlines in <text> must be hidden by 
backslashes. 


The c command may have two addresses, and therefore select a range of lines. 
If it does, all the lines in the range are deleted, but only one copy of <text> is 
written to the output, not one copy per line deleted. As with a and i, <text> 
is not scanned for address matches, and no editing commands are attempted 
on it. It does not change the line-number counter. 


After a line has been deleted by a c function, no further commands are 
attempted on the corpse. 


If text is appended after a line by a or r functions, and the line is 
subsequently changed, the text inserted by the c function will be placed before the 
text of the a or r functions. (The r function is described in Section 3.3.) 


Note: Within the text put in the output by these functions, leading blanks and tabs will disap- 
pear, as always in sed commands. To get leading blanks and tabs into the output, precede the 
first desired blank or tab by a backslash; the backslash will not appear in the output. 


Example: 


The list of editing commands: 


n 
a\ 
XXXX 
d 


applied to our standard input, produces: 


In Xanadu did Kubla Khan 
XXXX 
Where Alph, the sacred river, ran 
XXXX 
Down to a sunless sea. 


In this particular case, the same effect would be produced by either of the two following 
command lists: 


n 
i\ 
XXXX 
d 


n 
c\ 
XXXX 
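
The three command lists can be compared from a shell by passing each command with its own -e flag. This is a sketch that assumes a POSIX-style sed in which a\, i\ and c\ take their text on the following line; the temporary file is an arbitrary choice.

```shell
poem=$(mktemp)
cat > "$poem" <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# n, then append XXXX after the line just read, then delete that line.
with_a=$(sed -e 'n' -e 'a\
XXXX' -e 'd' "$poem")

# The same list using insert instead of append.
with_i=$(sed -e 'n' -e 'i\
XXXX' -e 'd' "$poem")

# c both deletes the line and supplies the text, so no d is needed.
with_c=$(sed -e 'n' -e 'c\
XXXX' "$poem")

rm -f "$poem"
echo "$with_a"
```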


3.2. Substitute Function 


One very important function changes parts of lines selected by a context search within the 
line. 


(2)s<pattern><replacement><flags> -- substitute 


The s function replaces part of a line (selected by <pattern>) with <replace- 
ment>. It can best be read: 


Substitute for <pattern>, <replacement> 


The <pattern> argument contains a pattern, exactly like the patterns in 
addresses (see 2.2 above). The only difference between <pattern> and a con- 
text address is that the context address must be delimited by slash (‘/’) char- 
acters; <pattern> may be delimited by any character other than space or new- 
line. 


By default, only the first string matched by <pattern> is replaced, but see the 
g flag below. 


The <replacement> argument begins immediately after the second delimiting 
character of <pattern>, and must be followed immediately by another 
instance of the delimiting character. (Thus there are exactly three instances 
of the delimiting character.) 


The <replacement> is not a pattern, and the characters which are special in 
patterns do not have special meaning in <replacement>. Instead, other char- 
acters are special: 


& is replaced by the string matched by <pattern> 


\d (where d is a single digit) is replaced by the dth substring matched 
by parts of <pattern> enclosed in ‘\(’ and ‘\)’. If nested sub- 
strings occur in <pattern>, the dth is determined by counting 
opening delimiters (‘\(’). 


As in patterns, special characters may be made literal by 
preceding them with backslash (‘\’). 


The <flags> argument may contain the following flags: 


g -- substitute <replacement> for all (non-overlapping) instances of 
<pattern> in the line. After a successful substitution, the 
scan for the next instance of <pattern> begins just after the 
end of the inserted characters; characters put into the line 
from <replacement> are not rescanned. 


p -- print the line if a successful replacement was done. The p flag 
causes the line to be written to the output if and only if a sub- 
stitution was actually made by the s function. Notice that if 
several s functions, each followed by a p flag, successfully sub- 
stitute in the same input line, multiple copies of the line will 
be written to the output: one for each successful substitution. 


w <filename> -- write the line to a file if a successful replacement was 
done. The w flag causes lines which are actually substituted 
by the s function to be written to a file named by <filename>. 
If <filename> exists before sed is run, it is overwritten; if not, 
it is created. 


A single space must separate w and <filename>. 


The possibilities of multiple, somewhat different copies of one 
input line being written are the same as for p. 


A maximum of 10 different file names may be mentioned after 
w flags and w functions (see below), combined. 


Examples: 


The following command, applied to our standard input, 


s/to/by/w changes 


produces, on the standard output: 


In Xanadu did Kubla Khan 
A stately pleasure dome decree: 
Where Alph, the sacred river, ran 
Through caverns measureless by man 
Down by a sunless sea. 


and, on the file ‘changes’: 


Through caverns measureless by man 
Down by a sunless sea. 


If the nocopy option (the -n flag) is in effect, the command: 


s/[.,;?:]/*P&*/gp 


produces: 


A stately pleasure dome decree*P:* 
Where Alph*P,* the sacred river*P,* ran 
Down to a sunless sea*P.* 


Finally, to illustrate the effect of the g flag, the command: 


/X/s/an/AN/p 


produces (assuming nocopy mode): 


In XANadu did Kubla Khan 


and the command: 


/X/s/an/AN/gp 


produces: 


In XANadu did Kubla KhAN 
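
The substitute examples can be reproduced from a shell as follows. This is a sketch assuming a POSIX-style sed; the temporary directory is an arbitrary choice, and the last command, which swaps two words with \1 and \2, is an added illustration that does not appear in the article.

```shell
dir=$(mktemp -d)
cat > "$dir/poem" <<'EOF'
In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
EOF

# w flag: lines on which a substitution was made are also written to "changes".
(cd "$dir" && sed 's/to/by/w changes' poem > out)
changes=$(cat "$dir/changes")

# & stands for the matched string; g applies to every match; p prints under -n.
marked=$(sed -n 's/[.,;?:]/*P&*/gp' "$dir/poem")

# Without g only the first match on the line is replaced.
first_only=$(sed -n '/X/s/an/AN/p' "$dir/poem")
every=$(sed -n '/X/s/an/AN/gp' "$dir/poem")

# \1 and \2 refer to the substrings matched inside \( and \).
swapped=$(echo 'Kubla Khan' | sed 's/\(Kubla\) \(Khan\)/\2 \1/')

rm -rf "$dir"
printf '%s\n' "$changes" "$marked" "$first_only" "$every" "$swapped"
```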


3.3. Input-output Functions 
(2)p -- print 


The print function writes the addressed lines to the standard output file. 
They are written at the time the p function is encountered, regardless of what 
succeeding editing commands may do to the lines. 


(2)w <filename> -- write on <filename> 


The write function writes the addressed lines to the file named by <filename>. 
If the file previously existed, it is overwritten; if not, it is created. The lines 
are written exactly as they exist when the write function is encountered for 
each line, regardless of what subsequent editing commands may do to them. 


Exactly one space must separate the w and <filename>. 


A maximum of ten different files may be mentioned in write functions and w 
flags after s functions, combined. 


(1)r <filename> -- read the contents of a file 


The read function reads the contents of <filename>, and appends them after 
the line matched by the address. The file is read and appended regardless of 
what subsequent editing commands do to the line which matched its address. 
If r and a functions are executed on the same line, the text from the a func- 
tions and the r functions is written to the output in the order that the func- 
tions are executed. 


Exactly one space must separate the r and <filename>. If a file mentioned by 
an r function cannot be opened, it is considered a null file, not an error, and no 
diagnostic is given. 


NOTE: Since there is a limit to the number of files that can be opened simultaneously, care 
should be taken that no more than ten files be mentioned in w functions or flags; that number 
is reduced by one if any r functions are present. (Only one read file is open at one time.) 


Examples 


Assume that the file ‘note1’ has the following contents: 


Note: Kubla Khan (more properly Kublai Khan; 1216-1294) was the grandson 
and most eminent successor of Genghiz (Chingiz) Khan, and founder of the 
Mongol dynasty in China. 


Then the following command: 


/Kubla/r note1 


produces: 


In Xanadu did Kubla Khan 
Note: Kubla Khan (more properly Kublai Khan; 1216-1294) was the grandson 
and most eminent successor of Genghiz (Chingiz) Khan, and founder of the 
Mongol dynasty in China. 
A stately pleasure dome decree: 
Where Alph, the sacred river, ran 
Through caverns measureless to man 
Down to a sunless sea. 
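
A shorter version of the same idea, including the case of a file that cannot be opened, might look like this. The file names and contents are invented for the sketch, which assumes a POSIX-style sed.

```shell
dir=$(mktemp -d)
printf 'first\nsecond\n' > "$dir/text"
printf 'NOTE\n' > "$dir/note1"

# The contents of note1 are appended after the line that matches /first/.
with_note=$(cd "$dir" && sed '/first/r note1' text)

# A file that cannot be opened is treated as empty, with no diagnostic;
# standard error is captured too, to show that nothing is reported.
missing=$(cd "$dir" && sed '/first/r no-such-file' text 2>&1)

rm -rf "$dir"
printf '%s\n' "$with_note" "$missing"
```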


3.4. Multiple Input-line Functions 


Three functions, all spelled with capital letters, deal specially with pattern spaces containing 
imbedded newlines; they are intended principally to provide pattern matches across lines in 
the input. 


(2)N -- Next line 


The next input line is appended to the current line in the pattern space; the 
two input lines are separated by an imbedded newline. Pattern matches may 
extend across the imbedded newline(s). 


(2)D -- Delete first part of the pattern space 


Delete up to and including the first newline character in the current pattern 
space. If the pattern space becomes empty (the only newline was the terminal 
newline), read another line from the input. In any case, begin the list of edit- 
ing commands again from its beginning. 


(2)P -- Print first part of the pattern space 
Print up to and including the first newline in the pattern space. 


The P and D functions are equivalent to their lower-case counterparts if there are no 
imbedded newlines in the pattern space. 
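
As an illustration of these three functions, the sketch below first joins input lines in pairs with N, and then uses N, P and D together to drop repeated adjacent lines. Neither command list comes from the article; both rely on the $ address and the ! function described elsewhere in this document, and assume a POSIX-style sed.

```shell
# N appends the next line; the s function then sees the imbedded newline.
pairs=$(printf 'a\nb\nc\nd\n' | sed -e 'N' -e 's/\n/ /')

# Keep a two-line window in the pattern space. P prints the first line
# unless both lines are identical; D then removes the first line and
# restarts the command list on what is left.
unique=$(printf 'a\na\nb\nc\nc\n' | sed -e '$!N' -e '/^\(.*\)\n\1$/!P' -e 'D')

printf '%s\n' "$pairs" "$unique"
```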


3.5. Hold and Get Functions 


Five functions save and retrieve part of the input for possible later use. 


(2)h -- hold pattern space 


The h function copies the contents of the pattern space into a hold area 
(destroying the previous contents of the hold area). 


(2)H -- Hold pattern space 


The H function appends the contents of the pattern space to the contents of 
the hold area; the former and new contents are separated by a newline. 


(2)g -- get contents of hold area 


The g function copies the contents of the hold area into the pattern space 
(destroying the previous contents of the pattern space). 


(2)G -- Get contents of hold area 


The G function appends the contents of the hold area to the contents of the 
pattern space; the former and new contents are separated by a newline. 


(2)x -- exchange 


The exchange command interchanges the contents of the pattern space and 
the hold area. 


Example 


The commands 


1h 
1s/ did.*// 
1x 
G 
s/\n/ :/ 


applied to our standard example, produce: 


In Xanadu did Kubla Khan :In Xanadu 
A stately pleasure dome decree: :In Xanadu 
Where Alph, the sacred river, ran :In Xanadu 
Through caverns measureless to man :In Xanadu 
Down to a sunless sea. :In Xanadu 
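
The hold area can be exercised from a shell as sketched below. The first command list is the one given in the example, applied to the first two lines of the sample text only; the second, which prints its input in reverse order, is an added illustration that is not in the article. A POSIX-style sed is assumed.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:'

# Save line 1, shorten it, swap it back, then tag every line with it.
tagged=$(printf '%s\n' "$sample" |
    sed -e '1h' -e '1s/ did.*//' -e '1x' -e 'G' -e 's/\n/ :/')

# Reverse the input: append the hold area to each line after the first,
# save the result, and print only when the last line has been reached.
reversed=$(printf 'a\nb\nc\n' | sed -n -e '1!G' -e 'h' -e '$p')

printf '%s\n' "$tagged" "$reversed"
```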


3.6. Flow-of-Control Functions 


These functions do no editing on the input lines, but control the application of functions to 
the lines selected by the address part. 


(2)! -- Don’t 


The Don’t command causes the next command (written on the same line) to 
be applied to all and only those input lines not selected by the address part. 


(2){ -- Grouping 


The grouping command ‘{’ causes the next set of commands to be applied (or 
not applied) as a block to the input lines selected by the addresses of the 
grouping command. The first of the commands under control of the grouping 
may appear on the same line as the ‘{’ or on the next line. 


The group of commands is terminated by a matching ‘}’ standing on a line by 
itself. 


Groups can be nested. 


(0):<label> -- place a label 


The label function marks a place in the list of editing commands which may 
be referred to by b and t functions. The <label> may be any sequence of 
eight or fewer characters; if two different colon functions have identical labels, 
a compile time diagnostic will be generated, and no execution attempted. 


(2)b<label> -- branch to label 


The branch function causes the sequence of editing commands being applied 
to the current input line to be restarted immediately after the place where a 
colon function with the same <label> was encountered. If no colon function 
with the same label can be found after all the editing commands have been 
compiled, a compile time diagnostic is produced, and no execution is 
attempted. 


A b function with no <label> is taken to be a branch to the end of the list of 
editing commands; whatever should be done with the current input line is 
done, and another input line is read; the list of editing commands is restarted 
from the beginning on the new line. 


(2)t<label> -- test substitutions 


The t function tests whether any successful substitutions have been made on 
the current input line; if so, it branches to <label>; if not, it does nothing. 
The flag which indicates that a successful substitution has been executed is 
reset by: 


1) reading a new input line, or 
2) executing a t function. 
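
The flow-of-control functions are easiest to follow on very small inputs. The sketch below uses invented data and the label name again, and assumes a POSIX-style sed; each part of a group or loop is passed with its own -e flag so that every command stands on its own line.

```shell
input='x keep
x drop'

# ! applies p to the lines NOT selected by the address.
negated=$(printf '%s\n' "$input" | sed -n '/keep/!p')

# { } applies a block of commands to the selected lines only.
grouped=$(printf '%s\n' "$input" |
    sed -n -e '/keep/{' -e 's/x/X/' -e 'p' -e '}')

# b with no label branches to the end of the list, skipping the s function.
branched=$(printf '%s\n' "$input" | sed -e '/keep/b' -e 's/x/y/')

# t loops back to the label for as long as a substitution succeeds,
# squeezing each run of spaces down to a single space.
squeezed=$(echo 'a    b  c' | sed -e ':again' -e 's/  / /' -e 't again')

printf '%s\n' "$negated" "$grouped" "$branched" "$squeezed"
```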


3.7. Miscellaneous Functions 
(1)= -- equals 
The = function writes to the standard output the line number of the line 
matched by its address. 
(1)q -- quit 


The q function causes the current line to be written to the output (if it should 
be), any appended or read text to be written, and execution to be terminated. 
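
A brief sketch of both functions, on an invented three-line input and assuming a POSIX-style sed:

```shell
# $= prints the number of the last line, that is, a count of the input lines.
count=$(printf 'a\nb\nc\n' | sed -n '$=')

# A context address with = reports where a pattern occurs.
where=$(printf 'a\nb\nc\n' | sed -n '/b/=')

# q writes the current line (when copying is in effect) and then stops.
upto=$(printf 'a\nb\nc\n' | sed '/b/q')

printf '%s\n' "$count" "$where" "$upto"
```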


Reference 


[1] Ken Thompson and Dennis M. Ritchie, The UNIX Programmer’s Manual. Bell Labora- 
tories, 1978. 


PART 4: COMMAND INTERPRETERS 


A shell is a command interpreter, an interface between a user and the operating system. The 
ULTRIX-32 system provides two shells: the Bourne Shell (the UNIX System 7 shell) and the 
C Shell (the Berkeley shell). Each shell allows users to communicate with the ULTRIX-32 
system to call editors, compilers, and other utilities, and to manipulate files. Figure 1-1 shows 
how the shells relate to the ULTRIX-32 system utilities. 


[Figure 1-1 labels: Program Development Tools, File Manipulation Tools, Communication Tools, 
System Administration Tools, Text Formatters, Compilers, Editors, Mail; Bourne Shell, C Shell] 


Figure 1-1 Shells in the ULTRIX-32 System 


When you use a shell interactively, it serves as a command language; when you write and 
execute a sequence of shell commands, the shell serves as a programming language. Both shells 
offer features for flow control, parameter substitution, shell variables, fault trapping, and 
debugging. The Bourne Shell was written first. The C Shell was developed to provide 
additional interactive features. It is called the C Shell because its command language, syntax, and 
control flow are similar to the C programming language. The two shells are, in general, not 
compatible; programs written for the Bourne Shell will not run on the C Shell without 
alteration. You can set up your login file to permanently establish one of these shells as your 
default shell. 


This part includes an article describing each shell. If you choose to use the C Shell, you will 
find both articles useful. If you use the Bourne Shell, skip the “Introduction to the C Shell.” 


The first article, “An Introduction to the UNIX Shell,” by S. R. Bourne, explains the Bourne 
Shell concepts, commands, and command formats, and it demonstrates all major features with 
examples and explanations. The two appendixes at the end of the article make a handy refer- 
ence: “Grammar” and “Metacharacters and Reserved Words.” 


The “Introduction to the C Shell,” by William Joy, is more expansive in its examples and 
explanations than the Bourne article, and it concentrates more on interactive use of the shell. 
The article documents all features unique to the C Shell, including history, aliases, argument 
expansion, C language-type arithmetic operations, and job control. A handy glossary at the 
end of the article defines C Shell commands and concepts. 


As you read these articles, refer to the ULTRIX-32 Programmers Manual, Binder 1. It gives 
detailed specifications for each command. The shell articles in this volume provide a 
background for those specifications. Bourne and Joy show how to coordinate the commands to 
produce useful results. 


An Introduction to the UNIX Shell 
S. R. Bourne 


Bell Laboratories 
Murray Hill, New Jersey 07974 


1.0 Introduction 


The shell is both a command language and a programming language that provides an interface 
to the UNIX operating system. This memorandum describes, with examples, the UNIX shell. 
The first section covers most of the everyday requirements of terminal users. Some 
familiarity with UNIX is an advantage when reading this section; see, for example, “UNIX for 
beginners”[1]. Section 2 describes those features of the shell primarily intended for use within 
shell procedures. These include the control-flow primitives and string-valued variables 
provided by the shell. A knowledge of a programming language would be a help when reading 
this section. The last section describes the more advanced features of the shell. References of 
the form “see pipe (2)” are to a section of the UNIX manual[2]. 


1.1 Simple commands 


Simple commands consist of one or more words separated by blanks. The first word is the 
name of the command to be executed; any remaining words are passed as arguments to the 
command. For example, 


who 


is a command that prints the names of users logged in. The command 


ls -l 


prints a list of files in the current directory. The argument -l tells ls to print status 
information, size and the creation date for each file. 


1.2 Background commands 
To execute a command the shell normally creates a new process and waits for it to finish. A 
command may be run without waiting for it to finish. For example, 

cc pgm.c & 


calls the C compiler to compile the file pgm.c. The trailing & is an operator that instructs the 
shell not to wait for the command to finish. To help keep track of such a process the shell 
reports its process number following its creation. A list of currently active processes may be 
obtained using the ps command. 
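
A minimal sketch of a background command follows. It uses sleep as a stand-in for a long-running program such as the compiler, and it relies on the $! parameter and the wait command, which are standard shell features that this section has not introduced.

```shell
# Start a command in the background; the shell does not wait for it.
sleep 1 &
bg_pid=$!          # process number of the most recent background command

# Other commands could run here while sleep is still active.

wait "$bg_pid"     # now wait for the background process to finish
status=$?
echo "process $bg_pid finished with status $status"
```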


1.3 Input output redirection 


Most commands produce output on the standard output that is initially connected to the 
terminal. This output may be sent to a file by writing, for example, 


ls -l >file 


The notation >file is interpreted by the shell and is not passed as an argument to ls. If file 
does not exist then the shell creates it; otherwise the original contents of file are replaced with 
the output from ls. Output may be appended to a file using the notation 


ls -l >>file 


In this case file is also created if it does not already exist. 


The standard input of a command may be taken from a file instead of the terminal by writing, 
for example, 


wc <file 


The command wc reads its standard input (in this case redirected from file) and prints the 
number of characters, words and lines found. If only the number of lines is required then 


wc -l <file 


could be used. 


UNIX is a Trademark of Bell Laboratories 
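
The three redirection forms can be tried together as sketched below. The temporary directory and the text written to the file are invented for the illustration.

```shell
dir=$(mktemp -d)

echo 'first line'  >  "$dir/file"    # creates file, or replaces its contents
echo 'second line' >> "$dir/file"    # appends to file

# Standard input is taken from file instead of the terminal.
lines=$(wc -l < "$dir/file")
lines=$(echo $lines)                 # drop any padding that wc adds

contents=$(cat "$dir/file")
rm -rf "$dir"
echo "$lines"
```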


1.4 Pipelines and filters 


The standard output of one command may be connected to the standard input of another by 
writing the ‘pipe’ operator, indicated by |, as in, 


ls -l | wc 


Two commands connected in this way constitute a pipeline and the overall effect is the same 
as 


ls -l >file; wc <file 


except that no file is used. Instead the two processes are connected by a pipe (see pipe (2)) 
and are run in parallel. Pipes are unidirectional and synchronization is achieved by halting wc 
when there is nothing to read and halting ls when the pipe is full. 


A filter is a command that reads its standard input, transforms it in some way, and prints the 
result as output. One such filter, grep, selects from its input those lines that contain some 
specified string. For example, 


ls | grep old 


prints those lines, if any, of the output from ls that contain the string old. Another useful 
filter is sort. For example, 


who | sort 


will print an alphabetically sorted list of logged in users. 


A pipeline may consist of more than two commands, for example, 


ls | grep old | wc -l 


prints the number of file names in the current directory containing the string old. 
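
The pipelines in this section can be tried in a scratch directory as sketched below. The file names are invented, and printf supplies a fixed list of names to sort in place of the output of who, which varies from system to system.

```shell
dir=$(mktemp -d)
touch "$dir/old.c" "$dir/new.c" "$dir/older.h"

# Two-stage pipeline: keep only the names that contain "old".
old_names=$(ls "$dir" | grep old)

# Three-stage pipeline: count those names.
old_count=$(ls "$dir" | grep old | wc -l)
old_count=$(echo $old_count)         # drop any padding that wc adds

# sort used as a filter on an invented list of user names.
sorted=$(printf 'pat\nfred\nann\n' | sort)

rm -rf "$dir"
printf '%s\n' "$old_names" "$old_count" "$sorted"
```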


1.5 File name generation 
Many commands accept arguments which are file names. For example, 


ls -l main.c 


prints information relating to the file main.c. 


The shell provides a mechanism for generating a list of file names that match a pattern. For 
example, 


ls -l *.c 


generates, as arguments to ls, all file names in the current directory that end in .c. The 
character * is a pattern that will match any string including the null string. In general patterns 
are specified as follows. 


*      Matches any string of characters including the null string. 
?      Matches any single character. 
[...]  Matches any one of the characters enclosed. A pair of characters separated by a 
       minus will match any character lexically between the pair. 


For example, 


[a-z]* 


matches all names in the current directory beginning with one of the letters a through z. 


/usr/fred/test/? 


matches all names in the directory /usr/fred/test that consist of a single character. If no file 
name is found that matches the pattern then the pattern is passed, unchanged, as an 
argument. 


This mechanism is useful both to save typing and to select names according to some pattern. 
It may also be used to find files. For example, 


echo /usr/fred/*/core 


finds and prints the names of all core files in sub-directories of /usr/fred. (echo is a 
standard UNIX command that prints its arguments, separated by blanks.) This last feature can be 
expensive, requiring a scan of all sub-directories of /usr/fred. 


There is one exception to the general rules given for patterns. The character ‘.’ at the start of 
a file name must be explicitly matched. 


echo * 


will therefore echo all file names in the current directory not beginning with ‘.’. 


echo .* 


will echo all those file names that begin with ‘.’. This avoids inadvertent matching of the 
names ‘.’ and ‘..’ which mean ‘the current directory’ and ‘the parent directory’ respectively. 
(Notice that ls suppresses information for the files ‘.’ and ‘..’.) 
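
The pattern rules can be checked in a scratch directory as sketched below. The file names are invented for the illustration, and the pattern .h* is used in place of .* so that the result does not depend on how a particular shell treats the names ‘.’ and ‘..’.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch main.c util.c notes .hidden a

c_files=$(echo *.c)       # names ending in .c
single=$(echo ?)          # names consisting of one character
visible=$(echo *)         # a leading ‘.’ is not matched by *
no_match=$(echo *.xyz)    # nothing matches: the pattern is passed unchanged
dot=$(echo .h*)           # a leading ‘.’ has to be matched explicitly

cd / && rm -rf "$dir"
printf '%s\n' "$c_files" "$single" "$visible" "$no_match" "$dot"
```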


1.6 Quoting 


Characters that have a special meaning to the shell, such as < > * ? | &, are called 
metacharacters. A complete list of metacharacters is given in appendix B. Any character preceded by 
a \ is quoted and loses its special meaning, if any. The \ is elided so that 


echo \? 


will echo a single ?, and 


echo \\ 


will echo a single \. To allow long strings to be continued over more than one line the 
sequence \newline is ignored. 


\ is convenient for quoting single characters. When more than one character needs quoting 
the above mechanism is clumsy and error prone. A string of characters may be quoted by 
enclosing the string between single quotes. For example, 


echo xx'****'xx 


will echo 


xx****xx 


The quoted string may not contain a single quote but may contain newlines, which are 
preserved. This quoting mechanism is the simplest and is recommended for casual use. 


A third quoting mechanism using double quotes is also available that prevents interpretation 
of some but not all metacharacters. Discussion of the details is deferred to section 3.4. 
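
The backslash and single-quote mechanisms can be tried as sketched below. The results are captured in variables (names chosen for this illustration) so that they can be inspected afterwards.

```shell
q1=$(echo \?)            # the quoted ? is not treated as a pattern
q2=$(echo \\)            # a quoted backslash stands for itself
q3=$(echo xx'****'xx)    # single quotes protect the whole string

# A backslash followed by a newline is ignored, so the command continues.
q4=$(echo one \
two)

printf '%s\n' "$q1" "$q2" "$q3" "$q4"
```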


1.7 Prompting 
When the shell is used from a terminal it will issue a prompt before reading a command. By 
default this prompt is ‘$ ’. It may be changed by saying, for example, 

PS1=yesdear 


which sets the prompt to be the string yesdear. If a newline is typed and further input is
needed then the shell will issue the prompt ‘> ’. Sometimes this can be caused by mistyping
a quote mark. If it is unexpected then an interrupt (DEL) will return the shell to read
another command. This prompt may be changed by saying, for example,


PS2=more 


1.8 The shell and login 


Following login (1) the shell is called to read and execute commands typed at the terminal. If 
the user’s login directory contains the file .profile then it is assumed to contain commands 
and is read by the shell before reading any commands from the terminal. 


1.9 Summary 


• ls
Print the names of files in the current directory.

• ls >file
Put the output from ls into file.

• ls | wc -l
Print the number of files in the current directory.

• ls | grep old
Print those file names containing the string old.

• ls | grep old | wc -l
Print the number of files whose name contains the string old.

• cc pgm.c &
Run cc in the background.


2.0 Shell procedures


The shell may be used to read and execute commands contained in a file. For example,

sh file [ args ... ]


calls the shell to read commands from file. Such a file is called a command procedure or shell 
procedure. Arguments may be supplied with the call and are referred to in file using the posi- 
tional parameters $1, $2, .... For example, if the file wg contains 


who | grep $1 
then 

sh wg fred 
is equivalent to 


who | grep fred 


UNIX files have three independent attributes, read, write and execute. The UNIX command 
chmod (1) may be used to make a file executable. For example, 


chmod +x wg 

will ensure that the file wg has execute status. Following this, the command 
wg fred 

is equivalent to 
sh wg fred 


This allows shell procedures and programs to be used interchangeably. In either case a new 
process is created to run the command. 


As well as providing names for the positional parameters, the number of positional parameters 
in the call is available as $#. The name of the file being executed is available as $0. 


A special shell parameter $* is used to substitute for all positional parameters except $0. A 
typical use of this is to provide some default arguments, as in, 


nroff -T450 -ms $*


which simply prepends some arguments to those already given. 
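
As a small illustration of $#, $1 and $*, the sketch below writes a one-line procedure to a scratch file and calls it with two arguments. The file name /tmp/args$$ and the wording of the message are invented for the example; the here document and double quotes it relies on are described in sections 2.3 and 3.4.

```shell
proc=/tmp/args$$

# The procedure itself: report the argument count, the first argument and all of them.
cat >$proc <<'END'
echo "$# arguments; first is $1; all are: $*"
END

out=`sh $proc fred bert`
rm -f $proc
echo "$out"
```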


2.1 Control flow - for 


A frequent use of shell procedures is to loop through the arguments ($1, $2, ...) executing 
commands once for each argument. An example of such a procedure is tel that searches the 
file /usr/lib/telnos that contains lines of the form


fred mh0123 
bert mh0789 


The text of tel is 


for i 
do grep $i /usr/lib/telnos; done 


The command

tel fred

prints those lines in /usr/lib/telnos that contain the string fred.

tel fred bert

prints those lines containing fred followed by those for bert.


The for loop notation is recognized by the shell and has the general form

for name in w1 w2 ...
do command-list
done

A command-list is a sequence of one or more simple commands separated or terminated by a
newline or semicolon. Furthermore, reserved words like do and done are only recognized
following a newline or semicolon. name is a shell variable that is set to the words w1 w2 ... in
turn each time the command-list following do is executed. If in w1 w2 ... is omitted then
the loop is executed once for each positional parameter; that is, in $* is assumed.


Another example of the use of the for loop is the create command whose text is 
for i do >$i; done 

The command 
create alpha beta 


ensures that the two files alpha and beta exist and are empty. The notation >file may be
used on its own to create or clear the contents of a file. Notice also that a semicolon (or new- 
line) is required before done. 
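
Both forms of the for loop can be tried directly at a terminal. In this sketch the set command (section 3.0) supplies the positional parameters, and the variable names implicit and explicit are chosen only for the illustration.

```shell
set fred bert                    # give this shell two positional parameters

implicit=
for i                            # no "in" list: loop over $1, $2, ...
do implicit="$implicit<$i>"; done

explicit=
for name in alpha beta gamma     # explicit word list
do explicit="$explicit<$name>"
done

echo "$implicit"
echo "$explicit"
```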


2.2 Control flow - case


A multiple way branch is provided for by the case notation. For example,

case $# in
1) cat >>$1 ;;
2) cat >>$2 <$1 ;;
*) echo 'usage: append [ from ] to' ;;
esac


is an append command. When called with one argument as 
append file 


$# is the string 1 and the standard input is copied onto the end of file using the cat
command.

append file1 file2

appends the contents of file1 onto file2. If the number of arguments supplied to append is
other than 1 or 2 then a message is printed indicating proper usage.


The general form of the case command is 


case word in 
pattern) command-list;; 


esac 


The shell attempts to match word with each pattern, in the order in which the patterns 
appear. If a match is found the associated command-list is executed and execution of the 
case is complete. Since * is the pattern that matches any string it can be used for the 
default case. 


A word of caution: no check is made to ensure that only one pattern matches the case
argument. The first match found defines the set of commands to be executed. In the example
below the commands following the second * will never be executed.

case $# in
*) ... ;;
*) ... ;;
esac


Another example of the use of the case construction is to distinguish between different forms 
of an argument. The following example is a fragment of a cc command. 


for i
do case $i in
   -[ocs]) ... ;;
   -*) echo "unknown flag $i" ;;
   *.c) /lib/c0 $i ... ;;
   *) echo "unexpected argument $i" ;;
   esac
done


To allow the same commands to be associated with more than one pattern the case command 
provides for alternative patterns separated by a |. For example, 


case $i in
-x|-y) ...
esac

is equivalent to

case $i in
-[xy]) ...
esac

The usual quoting conventions apply so that

case $i in
\?) ...


will match the character ?. 
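
The order of matching, alternative patterns and a quoted ? can all be seen in one loop. The sketch below is not taken from any real command; the argument list and the variable names kind and result are invented for the illustration.

```shell
result=
for i in -o -z main.c notes '?'
do case $i in
   -[ocs])  kind=flag ;;         # one of -o, -c or -s
   -*)      kind=unknown ;;      # any other flag; tried only if the line above fails
   *.c|*.s) kind=source ;;       # alternative patterns separated by |
   \?)      kind=question ;;     # a quoted ? matches only the character ?
   *)       kind=other ;;        # default case
   esac
   result="$result$i=$kind "
done
echo "$result"
```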


2.3 Here documents 


The shell procedure tel in section 2.1 uses the file /usr/lib/telnos to supply the data for grep. 
An alternative is to include this data within the shell procedure as a here document, as in, 


for i
do grep $i <<!
fred mh0123
bert mh0789
!
done

In this example the shell takes the lines between <<! and ! as the standard input for grep.
The string ! is arbitrary, the document being terminated by a line that consists of the string
following <<.


Parameters are substituted in the document before it is made available to grep as illustrated
by the following procedure called edg.

ed $3 <<%
g/$1/s//$2/g
w
%

The call

edg string1 string2 file

is then equivalent to the command

ed file <<%
g/string1/s//string2/g
w
%

and changes all occurrences of string1 in file to string2. Substitution can be prevented using \
to quote the special character $ as in

ed $3 <<+
1,\$s/$1/$2/g
w
+

(This version of edg is equivalent to the first except that ed will print a ? if there are no
occurrences of the string $1.) Substitution within a here document may be prevented entirely
by quoting the terminating string, for example,

grep $i <<\#
#

The document is presented without modification to grep. If parameter substitution is not
required in a here document this latter form is more efficient.
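
The effect of quoting the terminating string can be seen by writing the same document twice. This is a sketch assuming a writable /tmp; the terminator END and the scratch file names are arbitrary choices for the example.

```shell
user=fred
doc=/tmp/here$$

# Unquoted terminator: $user is substituted before cat sees the document.
cat >$doc.sub <<END
login: $user
END

# Quoted terminator: the document is passed on without modification.
cat >$doc.raw <<\END
login: $user
END

substituted=`cat $doc.sub`
verbatim=`cat $doc.raw`
rm -f $doc.sub $doc.raw
echo "$substituted"
echo "$verbatim"
```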


2.4 Shell variables 


The shell provides string-valued variables. Variable names begin with a letter and consist of 
letters, digits and underscores. Variables may be given values by writing, for example, 


user=fred box=m000 acct=mh0000 


which assigns values to the variables user, box and acct. A variable may be set to the null 
string by saying, for example, 


null= 
The value of a variable is substituted by preceding its name with $; for example, 
echo $user 


will echo fred. 


Variables may be used interactively to provide abbreviations for frequently used strings. For 
example, 


b=/usr/fred/bin 
mv pgm $b 


will move the file pgm from the current directory to the directory /usr/fred/bin. A more 
general notation is available for parameter (or variable) substitution, as in, 


echo ${user} 


which is equivalent to

echo $user

and is used when the parameter name is followed by a letter or digit. For example,

tmp=/tmp/ps
ps a >${tmp}a

will direct the output of ps to the file /tmp/psa, whereas,

ps a >$tmpa

would cause the value of the variable tmpa to be substituted.
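
The difference is easy to confirm without running ps at all. In this sketch tmpa is deliberately set to the null string so that the two results differ visibly; the names with_braces and without_braces are illustrative.

```shell
tmp=/tmp/ps
tmpa=                        # a different variable, here set to the null string

with_braces=${tmp}a          # value of tmp followed by the letter a
without_braces=$tmpa         # value of the variable tmpa

echo "with braces:    $with_braces"
echo "without braces: $without_braces"
```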


Except for $? the following are set initially by the shell. $? is set after executing each com- 
mand. 


$? The exit status (return code) of the last command executed as a decimal string. 
Most commands return a zero exit status if they complete successfully, other- 
wise a non-zero exit status is returned. Testing the value of return codes is 
dealt with later under if and while commands. 


$# The number of positional parameters (in decimal). Used, for example, in the 
append command to check the number of parameters. 


$$ The process number of this shell (in decimal). Since process numbers are 
unique among all existing processes, this string is frequently used to generate 
unique temporary file names. For example, 


ps a >/tmp/ps$$
...
rm /tmp/ps$$

$! The process number of the last process run in the background (in decimal).

$- The current shell flags, such as -x and -v.


Some variables have a special meaning to the shell and should be avoided for general use.


$MAIL When used interactively the shell looks at the file specified by this variable 
before it issues a prompt. If the specified file has been modified since it was 
last looked at the shell prints the message you have mail before prompting for 
the next command. This variable is typically set in the file .profile, in the 
user’s login directory. For example, 


MAIL=/usr/mail/fred 


$HOME The default argument for the cd command. The current directory is used to
resolve file name references that do not begin with a /, and is changed using the
cd command. For example,

cd /usr/fred/bin

makes the current directory /usr/fred/bin.

cat wn

will print on the terminal the file wn in this directory. The command cd with
no argument is equivalent to

cd $HOME

This variable is also typically set in the user’s login profile.


$PATH A list of directories that contain commands (the search path). Each time a
command is executed by the shell a list of directories is searched for an executable
file. If $PATH is not set then the current directory, /bin, and /usr/bin
are searched by default. Otherwise $PATH consists of directory names
separated by :. For example,

PATH=:/usr/fred/bin:/bin:/usr/bin

specifies that the current directory (the null string before the first :),
/usr/fred/bin, /bin and /usr/bin are to be searched in that order. In this way
individual users can have their own ‘private’ commands that are accessible
independently of the current directory. If the command name contains a / then
this directory search is not used; a single attempt is made to execute the command.


$PS1 The primary shell prompt string, by default, ‘$ ’. 
$PS2 The shell prompt when further input is needed, by default, ‘> ’. 
$IFS The set of characters used by blank interpretation (see section 3.4). 
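
A few of the parameters set by the shell can be inspected as follows. This is a sketch; sh -c is used only as a convenient way of producing a known exit status, and the name /tmp/demo$$ is illustrative.

```shell
sh -c 'exit 0'; ok=$?            # $? after a command that succeeds
sh -c 'exit 3' || failed=$?      # $? after a command that returns 3

set a b c; count=$#              # number of positional parameters

tmpname=/tmp/demo$$              # $$ makes the name unique to this shell

echo "ok=$ok failed=$failed count=$count tmpname=$tmpname"
```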


2.5 The test command 


The test command, although not part of the shell, is intended for use by shell programs. For
example,

test -f file

returns zero exit status if file exists and non-zero exit status otherwise. In general test
evaluates a predicate and returns the result as its exit status. Some of the more frequently
used test arguments are given here; see test (1) for a complete specification.

test s          true if the argument s is not the null string
test -f file    true if file exists
test -r file    true if file is readable
test -w file    true if file is writable
test -d file    true if file is a directory
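
The predicates above may be tried on a scratch directory. The sketch assumes a writable /tmp and uses the if command of section 2.7 to turn each exit status into a visible yes or no; all names are invented for the example.

```shell
dir=/tmp/test$$
mkdir $dir
>$dir/empty                      # an ordinary, empty file
empty=                           # a variable holding the null string

if test -f $dir/empty; then is_file=yes; else is_file=no; fi
if test -d $dir/empty; then is_dir=yes; else is_dir=no; fi
if test -d $dir; then dir_is_dir=yes; else dir_is_dir=no; fi
if test "$empty"; then non_null=yes; else non_null=no; fi

rm -r $dir
echo "$is_file $is_dir $dir_is_dir $non_null"
```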


2.6 Control flow - while


The actions of the for loop and the case branch are determined by data available to the
shell. A while or until loop and an if then else branch are also provided whose actions are
determined by the exit status returned by commands. A while loop has the general form

while command-list1
do command-list2
done

The value tested by the while command is the exit status of the last simple command
following while. Each time round the loop command-list1 is executed; if a zero exit status is
returned then command-list2 is executed; otherwise, the loop terminates. For example,

while test $1
do ...
   shift
done

is equivalent to

for i
do ...
done

shift is a shell command that renames the positional parameters $2, $3, ... as $1, $2, ... and
loses $1.


Another kind of use for the while/until loop is to wait until some external event occurs and
then run some commands. In an until loop the termination condition is reversed. For example,

until test -f file
do sleep 300; done
commands

will loop until file exists. Each time round the loop it waits for 5 minutes before trying again.
(Presumably another process will eventually create the file.)
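
A while loop driven by shift, and an until loop driven by a counter, might look as follows. This is a sketch only: expr is assumed to be available for the arithmetic, $1 is written inside double quotes (section 3.4) so that test always receives one argument, and the variable names are illustrative.

```shell
set alpha beta gamma

seen= count=0
while test "$1"                  # true while $1 is not the null string
do seen="$seen$1 "
   count=`expr $count + 1`
   shift                         # $2 becomes $1, and so on
done

tries=0
until test $tries -ge 3          # the loop ends when the test succeeds
do tries=`expr $tries + 1`; done

echo "seen: $seen count: $count tries: $tries"
```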


2.7 Control flow - if 
Also available is a general conditional branch of the form, 


if command-list 
then command-list 
else command-list 
fi 


that tests the value returned by the last simple command following if. 


The if command may be used in conjunction with the test command to test for the existence 
of a file as in 


if test -f file 

then process file 

else do something else 
fi 


An example of the use of if, case and for constructions is given in section 2.10. 
A multiple test if command of the form

if ...
then ...
else if ...
     then ...
     else if ...
          ...
          fi
     fi
fi

may be written using an extension of the if notation as,

if ...
then ...
elif ...
then ...
elif ...
...
fi

The following example is the touch command which changes the ‘last modified’ time for a list
of files. The command may be used in conjunction with make (1) to force recompilation of a
list of files.

flag=
for i
do case $i in
   -c) flag=N ;;
   *) if test -f $i
      then ln $i junk$$; rm junk$$
      elif test $flag
      then echo file \'$i\' does not exist
      else >$i
      fi
   esac
done


The -c flag is used in this command to prevent subsequent files from being created if they do
not already exist; an error message is printed instead. Without -c, a file that does not exist is
created. The shell variable flag is set to some non-null string if the -c argument is
encountered. The commands

ln ...; rm ...

make a link to the file and then remove it thus causing the last modified date to be updated.
The sequence

if command1
then command2
fi

may be written

command1 && command2

Conversely,

command1 || command2

executes command2 only if command1 fails. In each case the value returned is that of the last
simple command executed.
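
The short forms && and || and the elif notation can be exercised together. The sketch assumes that /tmp exists and is writable; the file name and the values assigned to state are invented for the illustration.

```shell
f=/tmp/cond$$
>$f
test -f $f && made=yes           # runs the assignment because test succeeds
rm -f $f
test -f $f || gone=yes           # runs the assignment because test now fails

if test -f $f
then state=file
elif test -d /tmp
then state=directory-only
else state=neither
fi

echo "$made $gone $state"
```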


2.8 Command grouping 


Commands may be grouped in two ways,

{ command-list ; }

and

( command-list )

In the first form command-list is simply executed. The second form executes command-list as a
separate process. For example,

(cd x; rm junk )


executes rm junk in the directory x without changing the current directory of the invoking 
shell. 


The commands

cd x; rm junk

have the same effect but leave the invoking shell in the directory x.


2.9 Debugging shell procedures


The shell provides two tracing mechanisms to help when debugging shell procedures. The
first is invoked within the procedure as

set -v

(v for verbose) and causes lines of the procedure to be printed as they are read. It is useful to
help isolate syntax errors. It may be invoked without modifying the procedure by saying

sh -v proc...

where proc is the name of the shell procedure. This flag may be used in conjunction with the
-n flag which prevents execution of subsequent commands. (Note that saying set -n at a
terminal will render the terminal useless until an end-of-file is typed.)


The command

set -x

will produce an execution trace. Following parameter substitution each command is printed
as it is executed. (Try these at the terminal to see what effect they have.) Both flags may be
turned off by saying

set -

and the current setting of the shell flags is available as $-.
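
The two traces can be compared on a two-line procedure. This is a sketch; the scratch file name is invented, and the exact layout of the trace (for instance the ‘+ ’ that commonly prefixes each traced command) may vary from one shell to another. The trace is written to the message output, hence the 2>&1 described in section 3.7.

```shell
proc=/tmp/trace$$
cat >$proc <<'END'
greeting=hello
echo $greeting world
END

verbose=`sh -v $proc 2>&1`       # lines printed as they are read
traced=`sh -x $proc 2>&1`        # commands printed after substitution

rm -f $proc
echo "$verbose"
echo "$traced"
```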


2.10 The man command 


The following is the man command which is used to print sections of the UNIX manual. It is 
called, for example, as 


man sh 
man -t ed 
man 2 fork 


In the first the manual section for sh is printed. Since no section is specified, section 1 is
used. The second example will typeset (-t option) the manual section for ed. The last prints
the fork manual page from section 2.

cd /usr/man

: 'colon is the comment command'
: 'default is nroff ($N), section 1 ($s)'
N=n s=1

for i
do case $i in
   [1-9]*) s=$i ;;
   -t) N=t ;;
   -n) N=n ;;
   -*) echo unknown flag \'$i\' ;;
   *) if test -f man$s/$i.$s
      then ${N}roff man0/${N}aa man$s/$i.$s
      else : 'look through all manual sections'
           found=no
           for j in 1 2 3 4 5 6 7 8 9
           do if test -f man$j/$i.$j
              then man $j $i
                   found=yes
              fi
           done
           case $found in
           no) echo "$i: manual page not found"
           esac
      fi
   esac
done

Figure 1. A version of the man command


3.0 Keyword parameters


Shell variables may be given values by assignment or when a shell procedure is invoked. An 
argument to a shell procedure of the form name=value that precedes the command name 
causes value to be assigned to name before execution of the procedure begins. The value of 
name in the invoking shell is not affected. For example, 


user=fred command 


will execute command with user set to fred. The -k flag causes arguments of the form 
name=value to be interpreted in this way anywhere in the argument list. Such names are 
sometimes called keyword parameters. If any arguments remain they are available as posi- 
tional parameters $1, $2, .... 

The set command may also be used to set positional parameters from within a procedure. For
example,

set - *

will set $1 to the first file name in the current directory, $2 to the next, and so on. Note that
the first argument, -, ensures correct treatment when the first file name begins with a -.
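
Both uses can be tried in a scratch directory. The sketch below makes two assumptions that go beyond the text: it writes the protective argument as --, the spelling accepted by present-day POSIX shells, and it uses sh -c to stand in for a shell procedure. The file names are invented.

```shell
user=nobody
inner=`user=fred sh -c 'echo $user'`   # keyword parameter for one command only

dir=/tmp/kw$$
here=`pwd`
mkdir $dir && cd $dir
>-n; >plain                      # one file name begins with a -
set -- *                         # file names become $1, $2, ...
first=$1 count=$#
cd "$here"; rm -r $dir

echo "inner=$inner user=$user first=$first count=$count"
```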


3.1 Parameter transmission 


When a shell procedure is invoked both positional and keyword parameters may be supplied 
with the call. Keyword parameters are also made available implicitly to a shell procedure by 
specifying in advance that such parameters are to be exported. For example, 


export user box 


marks the variables user and box for export. When a shell procedure is invoked copies are 
made of all exportable variables for use within the invoked procedure. Modification of such 
variables within the procedure does not affect the values in the invoking shell. It is generally 
true of a shell procedure that it may not modify the state of its caller without explicit request 
on the part of the caller. (Shared file descriptors are an exception to this rule.) 


Names whose value is intended to remain constant may be declared readonly. The form of 
this command is the same as that of the export command, 


readonly name... 


Subsequent attempts to set readonly variables are illegal. 
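
The following sketch shows an exported variable reaching an invoked shell, an unexported one not doing so, and a readonly variable rejecting assignment. It assumes that box is not already present in the environment; sh -c again stands in for a shell procedure, and the parentheses confine the readonly experiment to a separate process.

```shell
user=fred box=m000
export user                      # only user is marked for export

child=`sh -c 'echo "user=$user box=$box"'`

sh -c 'user=bert'                # the change is confined to the invoked shell

(k=1; readonly k; k=2) 2>/dev/null || rejected=yes

echo "$child; user is still $user; rejected=$rejected"
```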


3.2 Parameter substitution 


If a shell parameter is not set then the null string is substituted for it. For example, if the 
variable d is not set 


echo $d 

or 
echo ${d} 

will echo nothing. A default string may be given as in

echo ${d-.}

which will echo the value of the variable d if it is set and ‘.’ otherwise. The default string is
evaluated using the usual quoting conventions so that

echo ${d-'*'}

will echo * if the variable d is not set. Similarly

echo ${d-$1}

will echo the value of d if it is set and the value (if any) of $1 otherwise. A variable may be
assigned a default value using the notation

echo ${d=.}

which substitutes the same string as

echo ${d-.}

and if d were not previously set then it will be set to the string ‘.’. (The notation ${...=...} is
not available for positional parameters.)
If there is no sensible default then the notation 


echo ${d?message} 


will echo the value of the variable d if it has one, otherwise message is printed by the shell 
and execution of the shell procedure is abandoned. If message is absent then a standard mes- 
sage is printed. A shell procedure that requires some parameters to be set might start as fol- 
lows. 


: ${user?} ${acct?} ${bin?} 


Colon (:) is a command that is built in to the shell and does nothing once its arguments have 
been evaluated. If any of the variables user, acct or bin are not set then the shell will aban- 
don execution of the procedure. 
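
The three notations can be compared in a few lines. This sketch assumes the unset command found in present-day shells, used here only to guarantee that d and e start out unset; nosuchvar is a name assumed not to be set, and the wording of the shell's own error message varies between implementations.

```shell
unset d e

a=${d-.}                         # d is not set, so the default is substituted
b=${d=/tmp}                      # substitutes /tmp and also assigns it to d
c=${d-.}                         # d is now set, so its value is used
star=${e-'*'}                    # the default string may be quoted

# ${name?message} abandons the procedure; "reached" is never echoed.
msg=`sh -c ': ${nosuchvar?not set}; echo reached' 2>&1` || aborted=yes

echo "a=$a b=$b c=$c d=$d star=$star aborted=$aborted"
```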


3.3 Command substitution 


The standard output from a command can be substituted in a similar way to parameters. The 
command pwd prints on its standard output the name of the current directory. For example, 
if the current directory is /usr/fred/bin then the command 


d=`pwd`

is equivalent to

d=/usr/fred/bin

The entire string between grave accents (`...`) is taken as the command to be executed and is
replaced with the output from the command. The command is written using the usual quoting
conventions except that a ` must be escaped using a \. For example,

ls `echo "$1"`

is equivalent to

ls $1


Command substitution occurs in all contexts where parameter substitution occurs (including 
here documents) and the treatment of the resulting text is the same in both cases. This 
mechanism allows string processing commands to be used within shell procedures. An exam- 
ple of such a command is basename which removes a specified suffix from a string. For exam- 
ple, 


basename main.c .c 


will print the string main. Its use is illustrated by the following fragment from a cc com- 
mand. 


case $A in
*.c) B=`basename $A .c`
esac

that sets B to the part of $A with the suffix .c stripped.


Here are some composite examples.

for i in `ls -t`; do ...

The variable i is set to the names of files in time order, most recent first.

set `date`; echo $6 $2 $3, $4

will print, e.g., 1977 Nov 1, 23:59:59
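
The basename fragment can be run as it stands once A has a value. In this sketch the value main.c is simply assumed, and pwd is included as a second command whose output is captured.

```shell
A=main.c
case $A in
*.c) B=`basename $A .c` ;;       # output of basename becomes the value of B
esac

d=`pwd`                          # name of the current directory

echo "B=$B d=$d"
```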


3.4 Evaluation and quoting 


The shell is a macro processor that provides parameter substitution, command substitution 
and file name generation for the arguments to commands. This section discusses the order in 
which these evaluations occur and the effects of the various quoting mechanisms. 


Commands are parsed initially according to the grammar given in appendix A. Before a com- 
mand is executed the following substitutions occur. 


• parameter substitution, e.g. $user

• command substitution, e.g. `pwd`

Only one evaluation occurs so that if, for example, the value of the variable X is
the string $y then

echo $X

will echo $y.

• blank interpretation

Following the above substitutions the resulting characters are broken into non-blank
words (blank interpretation). For this purpose ‘blanks’ are the characters of
the string $IFS. By default, this string consists of blank, tab and newline. The
null string is not regarded as a word unless it is quoted. For example,

echo ''

will pass on the null string as the first argument to echo, whereas

echo $null

will call echo with no arguments if the variable null is not set or set to the null
string.

• file name generation

Each word is then scanned for the file pattern characters *, ? and [...] and an
alphabetical list of file names is generated to replace the word. Each such file
name is a separate argument.


The evaluations just described also occur in the list of words associated with a for loop. Only 
substitution occurs in the word used for a case branch. 


As well as the quoting mechanisms described earlier using \ and '...' a third quoting
mechanism is provided using double quotes. Within double quotes parameter and command
substitution occurs but file name generation and the interpretation of blanks does not. The
following characters have a special meaning within double quotes and may be quoted using \.

$    parameter substitution
`    command substitution
"    ends the quoted string
\    quotes the special characters $ ` " \


For example,

echo "$x"

will pass the value of the variable x as a single argument to echo. Similarly,

echo "$*"

will pass the positional parameters as a single argument and is equivalent to

echo "$1 $2 ..."

The notation $@ is the same as $* except when it is quoted.

echo "$@"

will pass the positional parameters, unevaluated, to echo and is equivalent to

echo "$1" "$2" ...
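
The difference between the three forms shows up in the number of arguments a command receives. In the sketch below sh -c 'echo $#' is merely a convenient way to count them (the word count fills the place of $0), and the first positional parameter deliberately contains a blank.

```shell
set 'a b' c                              # two parameters, the first containing a blank

unquoted=`sh -c 'echo $#' count $*`      # blank interpretation splits a b in two
star=`sh -c 'echo $#' count "$*"`        # everything passed as a single argument
at=`sh -c 'echo $#' count "$@"`          # each parameter passed intact

echo "unquoted=$unquoted star=$star at=$at"
```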


The following table gives, for each quoting mechanism, the shell metacharacters that are 
evaluated. 


        metacharacter
        \    $    *    `    "    '
'       n    n    n    n    n    t
`       y    n    n    t    n    n
"       y    y    n    y    t    n

        t    terminator
        y    interpreted
        n    not interpreted


Figure 2. Quoting mechanisms 


In cases where more than one evaluation of a string is required the built-in command eval 
may be used. For example, if the variable X has the value $y, and if y has the value pqr then 


eval echo $X 


will echo the string pqr. 
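
The effect of a second evaluation can be observed directly. In this sketch the variable up is an invented stand-in that uses echo and tr; it shows eval causing the | held in a variable to be treated as a pipe.

```shell
y=pqr
X='$y'

once=`echo $X`                   # one evaluation: the characters $y
twice=`eval echo $X`             # eval evaluates the result again: pqr

up='eval echo hello | tr a-z A-Z'
loud=`$up`                       # eval makes the | act as a pipe

echo "$once $twice $loud"
```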


In general the eval command evaluates its arguments (as do all commands) and treats the 
result as input to the shell. The input is read and the resulting command(s) executed. For 
example, 


wg='eval who|grep'
$wg fred

is equivalent to

who|grep fred

In this example, eval is required since there is no interpretation of metacharacters, such as |,
following substitution.


3.5 Error handling


The treatment of errors detected by the shell depends on the type of error and on whether the 
shell is being used interactively. An interactive shell is one whose input and output are con- 
nected to a terminal (as determined by gtty (2)). A shell invoked with the -i flag is also 
interactive. 


Execution of a command (see also 3.7) may fail for any of the following reasons.

• Input output redirection may fail, for example, if a file does not exist or cannot be
created.

• The command itself does not exist or cannot be executed.

• The command terminates abnormally, for example, with a "bus error" or "memory
fault". See Figure 3 below for a complete list of UNIX signals.

• The command terminates normally but returns a non-zero exit status.

In all of these cases the shell will go on to execute the next command. Except for the last case
an error message will be printed by the shell. All remaining errors cause the shell to exit from
a command procedure. An interactive shell will return to read another command from the
terminal. Such errors include the following.

• Syntax errors, e.g., if ... then ... done

• A signal such as interrupt. The shell waits for the current command, if any, to finish
execution and then either exits or returns to the terminal.

• Failure of any of the built-in commands such as cd.

The shell flag -e causes the shell to terminate if any error is detected.
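
The difference made by -e can be seen with a command that simply fails. This is a sketch; false is assumed to be available as a command returning a non-zero exit status, and sh -c stands in for a command procedure.

```shell
plain=`sh -c 'false; echo continued'`                    # goes on to the next command
strict=`sh -ec 'false; echo continued'` || stopped=yes   # terminates at the failure

echo "plain=$plain strict=$strict stopped=$stopped"
```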


1     hangup
2     interrupt
3*    quit
4*    illegal instruction
5*    trace trap
6*    IOT instruction
7*    EMT instruction
8*    floating point exception
9     kill (cannot be caught or ignored)
10*   bus error
11*   segmentation violation
12*   bad argument to system call
13    write on a pipe with no one to read it
14    alarm clock
15    software termination (from kill (1))


Figure 3. UNIX signals 


Those signals marked with an asterisk produce a core dump if not caught. However, the shell 
itself ignores quit which is the only external signal that can cause a dump. The signals in this 
list of potential interest to shell programs are 1, 2, 3, 14 and 15. 


3.6 Fault handling 


Shell procedures normally terminate when an interrupt is received from the terminal. The 
trap command is used if some cleaning up is required, such as removing temporary files. For 
example, 


trap 'rm /tmp/ps$$; exit' 2

sets a trap for signal 2 (terminal interrupt), and if this signal is received will execute the
commands

rm /tmp/ps$$; exit


exit is another built-in command that terminates execution of a shell procedure. The exit is 
required; otherwise, after the trap has been taken, the shell will resume executing the pro- 
cedure at the place where it was interrupted. 
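
A trap of this kind can be tried without touching the keyboard. The sketch below departs from the example in two labelled ways: it traps signal 15 (software termination) rather than 2, and the procedure sends the signal to itself with kill, standing in for a signal arriving from outside. The file names are invented.

```shell
proc=/tmp/trap$$
cat >$proc <<'END'
tmp=/tmp/work$$
trap 'rm -f $tmp; echo caught; exit 1' 15
>$tmp
kill -15 $$                      # stand-in for a signal sent from elsewhere
echo not reached
END

out=`sh $proc` || status=$?
rm -f $proc
echo "out=$out status=$status"
```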


UNIX signals can be handled in one of three ways. They can be ignored, in which case the 
signal is never sent to the process. They can be caught, in which case the process must decide 
what action to take when the signal is received. Lastly, they can be left to cause termination 
of the process without it having to take any further action. If a signal is being ignored on 
entry to the shell procedure, for example, by invoking it in the background (see 3.7) then trap 
commands (and the signal) are ignored. 


The use of trap is illustrated by this modified version of the touch command (Figure 4). The 
cleanup action is to remove the file junk$$. 


flag=
trap 'rm -f junk$$; exit' 1 2 3 15
for i
do case $i in
   -c) flag=N ;;
   *) if test -f $i
      then ln $i junk$$; rm junk$$
      elif test $flag
      then echo file \'$i\' does not exist
      else >$i
      fi
   esac
done


Figure 4. The touch command 


The trap command appears before the creation of the temporary file; otherwise it would be 
possible for the process to die without removing the file. 


Since there is no signal 0 in UNIX it is used by the shell to indicate the commands to be exe- 
cuted on exit from the shell procedure. 
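
A trap on 0 gives a procedure a single place for its cleaning up. In this sketch the procedure and its scratch file are invented for the illustration; the trap commands run when the procedure finishes normally.

```shell
proc=/tmp/exit$$
cat >$proc <<'END'
tmp=/tmp/scratch$$
trap 'rm -f $tmp; echo cleaned up' 0
>$tmp
echo working
END

out=`sh $proc`
rm -f $proc
echo "$out"
```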


A procedure may, itself, elect to ignore signals by specifying the null string as the argument to 
trap. The following fragment is taken from the nohup command. 


trap '' 1 2 3 15

which causes hangup, interrupt, quit and software termination to be ignored both by the
procedure and by invoked commands.


Traps may be reset by saying 
trap 2 3 


which resets the traps for signals 2 and 3 to their default values. A list of the current values 
of traps may be obtained by writing 


trap 


The procedure scan (Figure 5) is an example of the use of trap where there is no exit in the
trap command. scan takes each directory in the current directory, prompts with its name,
and then executes commands typed at the terminal until an end of file or an interrupt is
received. Interrupts are ignored while executing the requested commands but cause
termination when scan is waiting for input.

d=`pwd`
for i in *
do if test -d $d/$i
   then cd $d/$i
        while echo "$i:"
              trap exit 2
              read x
        do trap : 2; eval $x; done
   fi
done

Figure 5. The scan command


read x is a built-in command that reads one line from the standard input and places the result 
in the variable x. It returns a non-zero exit status if either an end-of-file is read or an inter- 
rupt is received. 


3.7 Command execution 


To run a command (other than a built-in) the shell first creates a new process using the sys- 
tem call fork. The execution environment for the command includes input, output and the 
states of signals, and is established in the child process before the command is executed. The 
built-in command exec is used in the rare cases when no fork is required and simply replaces 
the shell with a new command. For example, a simple version of the nohup command looks 
like 

trap '' 1 2 3 15
exec $*

The trap turns off the signals specified so that they are ignored by subsequently created
commands and exec replaces the shell by the command specified.
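
That exec replaces the shell, rather than returning to it, can be seen in one line. This is a sketch; sh -c stands in for a procedure and the echoed words are arbitrary.

```shell
# The second echo is never run: the shell no longer exists after exec.
out=`sh -c 'exec echo replaced; echo never printed'`
echo "$out"
```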


Most forms of input output redirection have already been described. In the following word is 
only subject to parameter and command substitution. No file name generation or blank 
interpretation takes place so that, for example, 


echo ... >*.C 
will write its output into a file whose name is *.c. Input output specifications are evaluated 
left to right as they appear in the command. 
> word    The standard output (file descriptor 1) is sent to the file word which is created if 
          it does not already exist. 
>> word   The standard output is sent to file word. If the file exists then output is 
          appended (by seeking to the end); otherwise the file is created. 
< word    The standard input (file descriptor 0) is taken from the file word. 
<< word   The standard input is taken from the lines of shell input that follow up to but 
          not including a line consisting only of word. If word is quoted then no 
          interpretation of the document occurs. If word is not quoted then parameter and 
          command substitution occur and \ is used to quote the characters \ $ ` and the 
          first character of word. In the latter case \newline is ignored (c.f. quoted strings). 
>& digit  The file descriptor digit is duplicated using the system call dup(2) and the 
          result is used as the standard output. 
<& digit  The standard input is duplicated from file descriptor digit. 
<&-       The standard input is closed. 
>&-       The standard output is closed. 
Any of the above may be preceded by a digit in which case the file descriptor created is that 
specified by the digit instead of the default 0 or 1. For example, 
... 2>file 
runs a command with message output (file descriptor 2) directed to file. 


... 2>&1 


runs a command with its standard output and message output merged. (Strictly speaking file 
descriptor 2 is created by duplicating file descriptor 1 but the effect is usually to merge the 
two streams.) 
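
Because specifications are evaluated left to right, the position of 2>&1 matters. The following sketch shows both orders; it assumes a Bourne-compatible shell with functions and the mktemp utility, and emit is a hypothetical helper introduced only for the illustration.

```shell
tmp=`mktemp -d`

# emit is a hypothetical helper that writes one line to each stream.
emit() { echo out; echo err 1>&2; }

# Descriptor 1 is pointed at the file first; 2 is then duplicated from it,
# so both lines land in the file.
emit >"$tmp/both" 2>&1
both=`cat "$tmp/both"`

# Reversed order: 2 is duplicated from the original descriptor 1 before
# 1 is moved, so only "out" reaches the file. The outer redirection just
# captures the stray "err" line in a second file.
{ emit 2>&1 >"$tmp/only_out"; } >"$tmp/stray"
only_out=`cat "$tmp/only_out"`
stray=`cat "$tmp/stray"`

rm -r "$tmp"
echo "$both"
```

The second form is a common slip: it looks as if it should merge the streams into the file, but descriptor 2 ends up wherever descriptor 1 pointed before the file was opened.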


The environment for a command run in the background such as 
list *.c | lpr & 


is modified in two ways. Firstly, the default standard input for such a command is the empty 
file /dev/null. This prevents two processes (the shell and the command), which are running 
in parallel, from trying to read the same input. Chaos would ensue if this were not the case. 
For example, 


ed file & 


would allow both the editor and the shell to read from the same input at the same time. 


The other modification to the environment of a background command is to turn off the QUIT 
and INTERRUPT signals so that they are ignored by the command. This allows these signals 
to be used at the terminal without causing background commands to terminate. For this rea- 
son the UNIX convention for a signal is that if it is set to 1 (ignored) then it is never changed 
even for a short time. Note that the shell command trap has no effect for an ignored signal. 
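
The first modification can be observed directly. This is a sketch only, assuming a non-interactive Bourne-compatible shell without job control and the mktemp utility; the file name seen is illustrative.

```shell
tmp=`mktemp -d`

# With no input redirection, a background cat in a non-interactive shell
# reads /dev/null, sees end-of-file at once, and copies nothing.
cat >"$tmp/seen" &
wait

if test -s "$tmp/seen"
then got_input=yes
else got_input=no
fi
rm -r "$tmp"
echo "background cat received input: $got_input"
```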


3.8 Invoking the shell 
The following flags are interpreted by the shell when it is invoked. If the first character of 
argument zero is a minus, then commands are read from the file .profile. 
-c string 
   If the -c flag is present then commands are read from string. 
-s If the -s flag is present or if no arguments remain then commands are read from the 
   standard input. Shell output is written to file descriptor 2. 
-i If the -i flag is present or if the shell input and output are attached to a terminal (as 
   told by gtty) then this shell is interactive. In this case TERMINATE is ignored (so that 
   kill 0 does not kill an interactive shell) and INTERRUPT is caught and ignored (so 
   that wait is interruptible). In all cases QUIT is ignored by the shell. 
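
The -c and -s flags can be tried from an existing shell. A small sketch, assuming sh on the search path is a Bourne-compatible shell; the echoed strings are illustrative.

```shell
# -c: commands are read from the string argument.
from_string=`sh -c 'echo hello from a child shell'`

# -s: commands are read from the standard input, here a pipe.
from_stdin=`echo 'echo read from standard input' | sh -s`

echo "$from_string"
echo "$from_stdin"
```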


Acknowledgements 


The design of the shell is based in part on the original UNIX shell [3] and the PWB/UNIX 
shell [4], some features having been taken from both. Similarities also exist with the command 
interpreters of the Cambridge Multiple Access System [5] and of CTSS [6]. 


I would like to thank Dennis Ritchie and John Mashey for many discussions during the design 
of the shell. I am also grateful to the members of the Computing Science Research Center 
and to Joe Maranzano for their comments on drafts of this document. 


References 


1. B. W. Kernighan, UNIX for Beginners, 1978. 




2. K. Thompson and D. M. Ritchie, UNIX Programmer’s Manual, Bell Laboratories, 1978. 
   Seventh Edition. 

3. K. Thompson, “The UNIX Command Language,” in Structured Programming: Infotech 
   State of the Art Report, pp. 375-384, Infotech International Ltd., Nicholson House, 
   Maidenhead, Berkshire, England, March 1975. 

4. J. R. Mashey, PWB/UNIX Shell Tutorial, September 30, 1977. 

5. D. F. Hartley (Ed.), The Cambridge Multiple Access System: Users Reference 
   Manual, University Mathematical Laboratory, Cambridge, England, 1968. 

6. P. A. Crisman (Ed.), The Compatible Time-Sharing System, M.I.T. Press, Cambridge, 
   Mass., 1965. 




Appendix A - Grammar 


item:            word 
                 input-output 
                 name = value 

simple-command:  item 
                 simple-command item 

command:         simple-command 
                 ( command-list ) 
                 { command-list } 
                 for name do command-list done 
                 for name in word ... do command-list done 
                 while command-list do command-list done 
                 until command-list do command-list done 
                 case word in case-part ... esac 
                 if command-list then command-list else-part fi 

pipeline:        command 
                 pipeline | command 

andor:           pipeline 
                 andor && pipeline 
                 andor || pipeline 

command-list:    andor 
                 command-list ; 
                 command-list & 
                 command-list ; andor 
                 command-list & andor 

input-output:    > file 
                 < file 
                 >> word 
                 << word 

file:            word 
                 & digit 
                 & - 

case-part:       pattern ) command-list ;; 

pattern:         word 
                 pattern | word 

else-part:       elif command-list then command-list else-part 
                 else command-list 
                 empty 

empty: 

word:            a sequence of non-blank characters 

name:            a sequence of letters, digits or underscores starting with a letter 

digit:           0 1 2 3 4 5 6 7 8 9 


Appendix B - Meta-characters and Reserved Words 


a) syntactic 
   |        pipe symbol 
   &&       ‘andf’ symbol 
   ||       ‘orf’ symbol 
   ;        command separator 
   ;;       case delimiter 
   &        background commands 
   ( )      command grouping 
   <        input redirection 
   <<       input from a here document 
   >        output creation 
   >>       output append 


b) patterns 
   *        match any character(s) including none 
   ?        match any single character 
   [...]    match any of the enclosed characters 


c) substitution 
   ${...}   substitute shell variable 
   `...`    substitute command output 


d) quoting 
   \        quote the next character 
   '...'    quote the enclosed characters except for ' 
   "..."    quote the enclosed characters except for $ ` \ " 


e) reserved words 
   if then else elif fi 
   case in esac 
   for while until do done 
   { } 


Introduction 


A shell is a command language interpreter. Csh is the name of one particular command 
interpreter on UNIX. The primary purpose of csh is to translate command lines typed at a 
terminal into system actions, such as invocation of other programs. Csh is a user program 
just like any you might write. Hopefully, csh will be a very useful program for you in 
interacting with the UNIX system. 


In addition to this document, you will want to refer to a copy of the UNIX programmer’s 
manual. The csh documentation in the manual provides a full description of all features of 
the shell and is a final reference for questions about the shell. 


Many words in this document are shown in italics. These are important words; names of 
commands, and words which have special meaning in discussing the shell and UNIX. Many of 
the words are defined in a glossary at the end of this document. If you don’t know what is 
meant by a word, you should look for it in the glossary. 


1. Terminal usage of the shell 


1.1. The basic notion of commands 


A shell in UNIX acts mostly as a medium through which other programs are invoked. 
While it has a set of builtin functions which it performs directly, most commands cause exe- 
cution of programs that are, in fact, external to the shell. The shell is thus distinguished from 
the command interpreters of other systems both by the fact that it is just a user program, and 
by the fact that it is used almost exclusively as a mechanism for invoking other programs. 


Commands in the UNIX system consist of a list of strings or words interpreted as a com- 
mand name followed by arguments. Thus the command 


mail bill 


consists of two words. The first word mail names the command to be executed, in this case 
the mail program which sends messages to other users. The shell uses the name of the com- 
mand in attempting to execute it for you. It will look in a number of directories for a file 
with the name mail which is expected to contain the mail program. 


The rest of the words of the command are given as arguments to the command itself 
when it is executed. In this case we specified also the argument bill which is interpreted by 
the mail program to be the name of a user to whom mail is to be sent. In normal terminal 
usage we might use the mail command as follows. 


% mail bill 
I have a question about the csh documentation. 
My document seems to be missing page 5. 
Does a page five exist? 

Bill 
EOT 
% 


Here we typed a message to send to bill and ended this message with a ^D which sent 
an end-of-file to the mail program. (Here and throughout this document, the notation ‘^x’ is 
to be read ‘control-x’ and represents the striking of the x key while the control key is held 
down.) The mail program then echoed the characters ‘EOT’ and transmitted our message. 
The characters ‘% ’ were printed before and after the mail command by the shell to indicate 
that input was needed. 


After typing the ‘% ’ prompt the shell was reading command input from our terminal. 
We typed a complete command ‘mail bill’. The shell then executed the mail program with 
argument bill and went dormant waiting for it to complete. The mail program then read 
input from our terminal until we signaled an end-of-file by typing a ^D, after which the shell 
noticed that mail had completed and signaled us that it was ready to read from the terminal 
again by printing another ‘% ’ prompt. 


This is the essential pattern of all interaction with UNIX through the shell. A complete 
command is typed at the terminal, the shell executes the command and when this execution 
completes, it prompts for a new command. If you run the editor for an hour, the shell will 
patiently wait for you to finish editing and obediently prompt you again whenever you finish 
editing. 

An example of a useful command you can execute now is the tset command, which sets 
the default erase and kill characters on your terminal. The erase character erases the last 
character you typed and the kill character erases the entire line you have entered so far. By 
default, the erase character is ‘#’ and the kill character is ‘@’. Most people who use CRT 
displays prefer to use the backspace (^H) character as their erase character since it is then 
easier to see what you have typed so far. You can arrange this by typing 


tset -e 


which tells the program tset to set the erase character; its default setting for this character 
is a backspace. 


1.2. Flag arguments 


A useful notion in UNIX is that of a flag argument. While many arguments to commands 
specify file names or user names, some arguments rather specify an optional capability of the 
command which you wish to invoke. By convention, such arguments begin with the character 
‘-’ (hyphen). Thus the command 


ls 


will produce a list of the files in the current working directory. The option -s is the size 
option, and 


ls -s 


causes ls to also give, for each file, the size of the file in blocks of 512 characters. The manual 
section for each command in the UNIX reference manual gives the available options for each 
command. The ls command has a large number of useful and interesting options. Most other 
commands have either no options or only one or two options. It is hard to remember options 
of commands which are not used very frequently, so most UNIX utilities perform only one or 
two functions rather than having a large number of hard to remember options. 


1.3. Output to files 


Commands that normally read input or write output on the terminal can also be exe- 
cuted with this input and/or output done to a file. 


Thus suppose we wish to save the current date in a file called ‘now’. The command 
date 


will print the current date on our terminal. This is because our terminal is the default stan- 
dard output for the date command and the date command prints the date on its standard 
output. The shell lets us redirect the standard output of a command through a notation 
using the metacharacter ‘>’ and the name of the file where output is to be placed. Thus the 
command 


date > now 


runs the date command such that its standard output is the file ‘now’ rather than the termi- 
nal. Thus this command places the current date and time into the file ‘now’. It is important 
to know that the date command was unaware that its output was going to a file rather than to 
the terminal. The shell performed this redirection before the command began executing. 


One other thing to note here is that the file ‘now’ need not have existed before the date 
command was executed; the shell would have created the file if it did not exist. And if the file 
did exist? If it had existed previously these previous contents would have been discarded! A 
shell option noclobber exists to prevent this from happening accidentally; it is discussed in 
section 2.2. 
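
The create-or-overwrite behaviour of ‘>’ is the same in /bin/sh, so it can be sketched in Bourne shell syntax. The example below assumes the mktemp utility and uses echo in place of date so that the result is predictable.

```shell
tmp=`mktemp -d`

echo first > "$tmp/now"     # the shell creates the file
echo second > "$tmp/now"    # the previous contents are discarded
contents=`cat "$tmp/now"`

rm -r "$tmp"
echo "$contents"
```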


The system normally keeps files which you create with ‘>’ and all other files. Thus the 
default is for files to be permanent. If you wish to create a file which will be removed 
automatically, you can begin its name with a ‘#’ character; this ‘scratch’ character denotes the 
fact that the file will be a scratch file.* The system will remove such files after a couple of 
days, or sooner if file space becomes very tight. Thus, in running the date command above, 
we don’t really want to save the output forever, so we would more likely do 


date > #now 


*Note that if your erase character is a ‘#’, you will have to precede the ‘#’ with a ‘\’. The fact that the ‘#’ 
character is the old (pre-CRT) standard erase character means that it seldom appears in a file name, and 
allows this convention to be used for scratch files. If you are using a CRT, your erase character should be a 
^H; section 1.1 showed how this could be set up. 


1.4. Metacharacters in the shell 


The shell has a large number of special characters (like ‘>’) which indicate special func- 
tions. We say that these notations have syntactic and semantic meaning to the shell. In gen- 
eral, most characters which are neither letters nor digits have special meaning to the shell. 
We shall shortly learn a means of quotation which allows us to use metacharacters without 
the shell treating them in any special way. 


Metacharacters normally have effect only when the shell is reading our input. We need 
not worry about placing shell metacharacters in a letter we are sending via mail, or when we 
are typing in text or data to some other program. Note that the shell is only reading input 
when it has prompted with ‘% ’. 


1.5. Input from files; pipelines 


We learned above how to redirect the standard output of a command to a file. It is also 
possible to redirect the standard input of a command from a file. This is not often necessary 
since most commands will read from a file whose name is given as an argument. We can give 
the command 


sort < data 


to run the sort command with standard input, where the command normally reads its input, 
from the file ‘data’. We would more likely say 


sort data 


letting the sort command open the file ‘data’ for input itself since this is less to type. 
We should note that if we just typed 


sort 


then the sort program would sort lines from its standard input. Since we did not redirect the 
standard input, it would sort lines as we typed them on the terminal until we typed a ^D to 
indicate an end-of-file. 
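
Both ways of handing a file to sort give the same result. A brief sketch in /bin/sh syntax, assuming the mktemp utility; the contents of the file ‘data’ are illustrative.

```shell
tmp=`mktemp -d`
printf 'pear\napple\nfig\n' > "$tmp/data"

redirected=`sort < "$tmp/data"`    # the shell opens the file as standard input
by_name=`sort "$tmp/data"`         # sort opens the file itself

rm -r "$tmp"
echo "$redirected"
```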


A most useful capability is the ability to combine the standard output of one command 
with the standard input of another, i.e. to run the commands in a sequence known as a 
pipeline. For instance the command 


ls -s 


normally produces a list of the files in our directory with the size of each in blocks of 512 
characters. If we are interested in learning which of our files is largest we may wish to have 
this sorted by size rather than by name, which is the default way in which ls sorts. We could 
look at the many options of ls to see if there was an option to do this but would eventually 
discover that there is not. Instead we can use a couple of simple options of the sort 
command, combining it with ls to get what we want. 


The -n option of sort specifies a numeric sort rather than an alphabetic sort. Thus 


ls -s | sort -n 


specifies that the output of the ls command run with the option -s is to be piped to the 
command sort run with the numeric sort option. This would give us a sorted list of our files by 
size, but with the smallest first. We could then use the -r reverse sort option and the head 
command in combination with the previous command doing 


ls -s | sort -n -r | head -5 




Here we have taken a list of our files sorted alphabetically, each with the size in blocks. We 
have run this to the standard input of the sort command asking it to sort numerically in 
reverse order (largest first). This output has then been run into the command head which 
gives us the first few lines. In this case we have asked head for the first 5 lines. Thus this 
command gives us the names and sizes of our 5 largest files. 
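
The sort and head stages can be tried on a fixed listing, which keeps the result independent of what is in the directory. In this sketch the four size and name pairs are made up, printf stands in for ls -s, and head -n 2 is the present-day spelling of the count option.

```shell
# printf stands in for "ls -s": one "size name" pair per line.
largest=`printf '%s\n' '4 notes' '12 thesis' '1 todo' '8 prog.c' |
    sort -n -r |
    head -n 2`
echo "$largest"
```

Each stage sees only a stream of lines; none of them needs to know that the data began as a directory listing.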


The notation introduced above is called the pipe mechanism. Commands separated by 
‘|’ characters are connected together by the shell and the standard output of each is run into 
the standard input of the next. The leftmost command in a pipeline will normally take its 
standard input from the terminal and the rightmost will place its standard output on the 
terminal. Other examples of pipelines will be given later when we discuss the history 
mechanism; one important use of pipes which is illustrated there is in the routing of information 
to the line printer. 


1.6. Filenames 


Many commands to be executed will need the names of files as arguments. UNIX 
pathnames consist of a number of components separated by ‘/’. Each component except the last 
names a directory in which the next component resides, in effect specifying the path of 
directories to follow to reach the file. Thus the pathname 


/etc/motd 


specifies a file in the directory ‘etc’ which is a subdirectory of the root directory ‘/’. Within 
this directory the file named is ‘motd’ which stands for ‘message of the day’. A pathname 
that begins with a slash is said to be an absolute pathname since it is specified from the abso- 
lute top of the entire directory hierarchy of the system (the root). Pathnames which do not 
begin with ‘/’ are interpreted as starting in the current working directory, which is, by default, 
your home directory and can be changed dynamically by the cd change directory command. 
Such pathnames are said to be relative to the working directory since they are found by start- 
ing in the working directory and descending to lower levels of directories for each component 
of the pathname. If the pathname contains no slashes at all then the file is contained in the 
working directory itself and the pathname is merely the name of the file in this directory. 
Absolute pathnames have no relation to the working directory. 


Most filenames consist of a number of alphanumeric characters and ‘.’s (periods). In 
fact, all printing characters except ‘/’ (slash) may appear in filenames. It is inconvenient to 
have most non-alphabetic characters in filenames because many of these have special meaning 
to the shell. The character ‘.’ (period) is not a shell metacharacter and is often used to 
separate the extension of a file name from the base of the name. Thus 


prog.c prog.o prog.errs prog.output 


are four related files. They share a base portion of a name (a base portion being that part of 
the name that is left when a trailing ‘.’ and following characters which are not ‘.’ are stripped 
off). The file ‘prog.c’ might be the source for a C program, the file ‘prog.o’ the corresponding 
object file, the file ‘prog.errs’ the errors resulting from a compilation of the program and the 
file ‘prog.output’ the output of a run of the program. 


If we wished to refer to all four of these files in a command, we could use the notation 
prog.* 


This word is expanded by the shell, before the command to which it is an argument is exe- 
cuted, into a list of names which begin with ‘prog.’. The character ‘*’ here matches any 
sequence (including the empty sequence) of characters in a file name. The names which 
match are alphabetically sorted and placed in the argument list of the command. Thus the 
command 


echo prog.* 


will echo the names 


prog.c prog.errs prog.o prog.output 


Note that the names are in sorted order here, and in a different order than we listed them above. 
The echo command receives four words as arguments, even though we only typed one word 
as an argument directly. The four words were generated by filename expansion of the one input 
word. 
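
That the expansion is done by the shell, before the command runs, can be checked by counting the words produced. A sketch in /bin/sh syntax, assuming the mktemp utility; the extra file ‘notes’ is illustrative and is there only to show that it is not matched.

```shell
tmp=`mktemp -d`
cd "$tmp" || exit 1
touch prog.c prog.o prog.errs prog.output notes

# The shell, not the command, expands the pattern into four sorted words.
set -- prog.*
count=$#
names="$*"

cd /
rm -r "$tmp"
echo "$count: $names"
```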


Other notations for filename expansion are also available. The character ‘?’ matches 
any single character in a filename. Thus 


echo ? ?? ??? 


will echo a line of filenames; first those with one character names, then those with two 
character names, and finally those with three character names. The names of each length will be 
independently sorted. 


Another mechanism consists of a sequence of characters between ‘[’ and ‘]’. This 
metasequence matches any single character from the enclosed set. Thus 


prog.[co] 


will match 


prog.c prog.o 


in the example above. We can also place two characters around a ‘-’ in this notation to 
denote a range. Thus 


chap.[1-5] 


might match files 


chap.1 chap.2 chap.3 chap.4 chap.5 


if they existed. This is shorthand for 


chap.[12345] 


and otherwise equivalent. 
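
The equivalence of the range and the explicit set can be checked directly. A sketch in /bin/sh syntax, assuming the mktemp utility; only some of the chapter files are created, and chap.7 is added to show a name outside the range.

```shell
tmp=`mktemp -d`
cd "$tmp" || exit 1
touch chap.1 chap.2 chap.3 chap.7

range=`echo chap.[1-5]`      # range form
listed=`echo chap.[12345]`   # explicit set

cd /
rm -r "$tmp"
echo "$range"
```

Only names that exist are produced, which is why the text says the pattern "might match" the five files.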


An important point to note is that if a list of argument words to a command (an argu- 
ment list) contains filename expansion syntax, and if this filename expansion syntax fails to 
match any existing file names, then the shell considers this to be an error and prints a diag- 
nostic 


No match. 


and does not execute the command. 


Another very important point is that files with the character ‘.’ at the beginning are 
treated specially. Neither ‘*’ nor ‘?’ nor the ‘[’ ‘]’ mechanism will match it. This prevents 
accidental matching of the filenames ‘.’ and ‘..’ in the working directory which have special 
meaning to the system, as well as other files such as .cshrc which are not normally visible. We 
will discuss the special role of the file .cshrc later. 
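
The special treatment of a leading ‘.’ is shared by /bin/sh and can be sketched there. The example assumes the mktemp utility; the file names .hidden and visible are illustrative.

```shell
tmp=`mktemp -d`
cd "$tmp" || exit 1
touch .hidden visible

plain=`echo *`      # the leading period is not matched by *
dotted=`echo .h*`   # it is matched when typed explicitly

cd /
rm -r "$tmp"
echo "$plain"
echo "$dotted"
```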


Another filename expansion mechanism gives access to the pathname of the home 
directory of other users. This notation consists of the character ‘~’ (tilde) followed by another 
user’s login name. For instance the word ‘~bill’ would map to the pathname ‘/usr/bill’ if the 
home directory for ‘bill’ was ‘/usr/bill’. Since, on large systems, users may have login 
directories scattered over many different disk volumes with different prefix directory names, this 
notation provides a reliable way of accessing the files of other users. 


A special case of this notation consists of a ‘~’ alone, e.g. ‘~/mbox’. This notation is 
expanded by the shell into the file ‘mbox’ in your home directory, i.e. into ‘/usr/bill/mbox’ for 
me on Ernie Co-vax, the UCB Computer Science Department VAX machine, where this 
document was prepared. This can be very useful if you have used cd to change to another 
directory and have found a file you wish to copy using cp. If I give the command 


cp thatfile ~ 


the shell will expand this command to 


cp thatfile /usr/bill 


since my home directory is /usr/bill. 


There also exists a mechanism using the characters ‘{’ and ‘}’ for abbreviating a set of 
words which have common parts but cannot be abbreviated by the above mechanisms because 
they are not files, are the names of files which do not yet exist, or are not thus conveniently 
described. This mechanism will be described much later, in section 4.2, as it is used less 
frequently. 


1.7. Quotation 


We have already seen a number of metacharacters used by the shell. These metacharac- 
ters pose a problem in that we cannot use them directly as parts of words. Thus the com- 
mand 


echo * 


will not echo the character ‘*’. It will either echo a sorted list of filenames in the current 
working directory, or print the message ‘No match’ if there are no files in the working 
directory. 

The recommended mechanism for placing characters which are neither letters, digits, 
‘/’, ‘.’ nor ‘-’ in an argument word to a command is to enclose it with single quotation 
characters ‘'’, i.e. 

echo '*' 

There is one special character ‘!’ which is used by the history mechanism of the shell and 
which cannot be escaped by placing it within ‘'’ characters. It and the character ‘'’ itself can 
be preceded by a single ‘\’ to prevent their special meaning. Thus 


echo \'\! 

prints 

'! 

These two mechanisms suffice to place any printing character into a word which is an 
argument to a shell command. They can be combined, as in 

echo \''*' 

which prints 

'* 


since the first ‘\’ escaped the first ‘'’ and the ‘*’ was enclosed between ‘'’ characters. 
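
The two quoting mechanisms behave the same way in /bin/sh, so the combined form can be sketched there. The history character ‘!’ is left out because it is particular to csh; the variable names are illustrative.

```shell
star='*'        # single quotes stop filename expansion of *
mixed=\''*'     # a backslash-quoted ' followed by a single-quoted *

echo "$star"
echo "$mixed"
```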


1.8. Terminating commands 


When you are executing a command and the shell is waiting for it to complete there are 
several ways to force it to stop. For instance if you type the command 


cat /etc/passwd 


the system will print a copy of a list of all users of the system on your terminal. This is likely 
to continue for several minutes unless you stop it. You can send an INTERRUPT signal to the 
cat command by typing the DEL or RUBOUT key on your terminal.* Since cat does not take 
any precautions to avoid or otherwise handle this signal the INTERRUPT will cause it to 
terminate. The shell notices that cat has terminated and prompts you again with ‘% ’. If you 
hit INTERRUPT again, the shell will just repeat its prompt since it handles INTERRUPT signals 
and chooses to continue to execute commands rather than terminating like cat did, which 
would have the effect of logging you out. 


Another way in which many programs terminate is when they get an end-of-file from 
their standard input. Thus the mail program in the first example above was terminated when 
we typed a ^D which generates an end-of-file from the standard input. The shell also 
terminates when it gets an end-of-file, printing ‘logout’; UNIX then logs you off the system. Since 
this means that typing too many ^D’s can accidentally log us off, the shell has a mechanism 
for preventing this. This ignoreeof option will be discussed in section 2.2. 


If a command has its standard input redirected from a file, then it will normally ter- 
minate when it reaches the end of this file. Thus if we execute 


mail bill < prepared.text 


the mail command will terminate without our typing a ^D. This is because it read to the 
end-of-file of our file ‘prepared.text’ in which we placed a message for ‘bill’ with an editor 
program. We could also have done 


cat prepared.text | mail bill 


since the cat command would then have written the text through the pipe to the standard 
input of the mail command. When the cat command completed it would have terminated, 
closing down the pipeline and the mail command would have received an end-of-file from it 
and terminated. Using a pipe here is more complicated than redirecting input so we would 
more likely use the first form. These commands could also have been stopped by sending an 
INTERRUPT. 
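
Both forms deliver the same end-of-file to the reading command. In the sketch below wc -l stands in for mail so that nothing is actually sent; it assumes the mktemp utility, and the two lines of message text are illustrative.

```shell
tmp=`mktemp -d`
printf 'line one\nline two\n' > "$tmp/prepared.text"

# wc -l stands in for mail: it reads until end-of-file, then terminates.
from_file=`wc -l < "$tmp/prepared.text"`
from_pipe=`cat "$tmp/prepared.text" | wc -l`

rm -r "$tmp"
echo $from_file $from_pipe
```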


Another possibility for stopping a command is to suspend its execution temporarily, with 
the possibility of continuing execution later. This is done by sending a STOP signal by typing 
a ^Z. This signal causes all commands running on the terminal (usually one but more if a 
pipeline is executing) to become suspended. The shell notices that the command(s) have been 
suspended, types ‘Stopped’ and then prompts for a new command. The previously executing 
command has been suspended, but is otherwise unaffected by the STOP signal. Any other 
commands can be executed while the original command remains suspended. The suspended 
command can be continued using the fg command with no arguments. The shell will then retype 
the command to remind you which command is being continued, and cause the command to 
resume execution. Unless any input files in use by the suspended command have been 
changed in the meantime, the suspension has no effect whatsoever on the execution of the 
command. This feature can be very useful during editing, when you need to look at another 
file before continuing. An example of command suspension follows. 


*Many users use stty(1) to change the interrupt character to ^C. 


% mail harold 
Someone just copied a big file into my directory and its name is 
^Z 
Stopped 
% ls 
funnyfile 
prog.c 
prog.o 
% jobs 
[1]  + Stopped          mail harold 
% fg 
mail harold 
funnyfile. Do you know who did it? 
EOT 
% 


In this example someone was sending a message to Harold and forgot the name of the file he 
wanted to mention. The mail command was suspended by typing ^Z. When the shell noticed 
that the mail program was suspended, it typed ‘Stopped’ and prompted for a new command. 
Then the ls command was typed to find out the name of the file. The jobs command was run 
to find out which command was suspended. At this time the fg command was typed to 
continue execution of the mail program. Input to the mail program was then continued and 
ended with a ^D which indicated the end of the message at which time the mail program 
typed EOT. The jobs command will show which commands are suspended. The ^Z should 
only be typed at the beginning of a line since everything typed on the current line is discarded 
when a signal is sent from the keyboard. This also happens on INTERRUPT and QUIT signals. 
More information on suspending jobs and controlling them is given in section 2.6. 


If you write or run programs which are not fully debugged then it may be necessary to 
stop them somewhat ungracefully. This can be done by sending them a QUIT signal, sent by 
typing a ^\. This will usually provoke the shell to produce a message like: 


Quit (Core dumped) 


indicating that a file ‘core’ has been created containing information about the program ‘a.out’s 
state when it terminated due to the QUIT signal. You can examine this file yourself, or for- 
ward information to the maintainer of the program telling him/her where the core file is. 


If you run background commands (as explained in section 2.6) then these commands will 
ignore INTERRUPT and QUIT signals at the terminal. To stop them you must use the kill com- 
mand. See section 2.6 for an example. 


If you want to examine the output of a command without having it move off the screen 
as the output of the 


cat /etc/passwd 


command will, you can use the command 


more /etc/passwd 


The more program pauses after each complete screenful and types ‘--More--’ at which 
point you can hit a space to get another screenful, a return to get another line, or a ‘q’ to end 
the more program. You can also use more as a filter, i.e. 


cat /etc/passwd | more 


works just like the simpler more command above. 


For stopping output of commands not involving more you can use the ^S key to stop the 
typeout. The typeout will resume when you hit ^Q or any other key, but ^Q is normally used 
because it only restarts the output and does not become input to the program which is 
running. This works well on low-speed terminals, but at 9600 baud it is hard to type ^S and 
^Q fast enough to paginate the output nicely, and a program like more is usually used. 


An additional possibility is to use the ^O flush output character; when this character is 
typed, all output from the current command is thrown away (quickly) until the next input 
read occurs or until the next shell prompt. This can be used to allow a command to complete 
without having to suffer through the output on a slow terminal; ^O is a toggle, so flushing can 
be turned off by typing ^O again while output is being flushed. 


1.9. What now? 


We have so far seen a number of mechanisms of the shell and learned a lot about the 
way in which it operates. The remaining sections will go yet further into the internals of the 
shell, but you will surely want to try using the shell before you go any further. To try it you 
can log in to UNIX and type the following command to the system: 


chsh myname /bin/csh 


Here ‘myname’ should be replaced by the name you typed to the system prompt of ‘login:’ to 
get onto the system. Thus I would use ‘chsh bill /bin/csh’. You only have to do this 
once; it takes effect at next login. You are now ready to try using csh. 


Before you do the ‘chsh’ command, the shell you are using when you log into the system 
is ‘/bin/sh’. In fact, much of the above discussion is applicable to ‘/bin/sh’. The next section 
will introduce many features particular to csh so you should change your shell to csh before 
you begin reading it. 


2. Details on the shell for terminal users 


2.1. Shell startup and termination 


When you log in, the shell is started by the system in your home directory and begins by 
reading commands from a file .cshrc in this directory. All shells which you may start during 
your terminal session will read from this file. We will later see what kinds of commands are 
usefully placed there. For now we need not have this file and the shell does not complain 
about its absence. 


A login shell, executed after you login to the system, will, after it reads commands from 
.cshrc, read commands from a file .login also in your home directory. This file contains com- 
mands which you wish to do each time you login to the UNIX system. My .login file looks 
something like: 


set ignoreeof 
set mail=(/usr/spool/mail/bill) 
echo "${prompt}users" ; users 
alias ts \ 
    'set noglob ; eval `tset -s -m dialup:c100rv4pna -m plugboard:?hp2621nl *`'; 
ts; stty intr ^C kill ^U crt 
set time=15 history=10 
msgs -f 
if (-e $mail) then 
    echo "${prompt}mail" 
    mail 
endif 


This file contains several commands to be executed by UNIX each time I login. The first 
is a set command which is interpreted directly by the shell. It sets the shell variable 
ignoreeof which causes the shell to not log me off if I hit ^D. Rather, I use the logout 
command to log off of the system. By setting the mail variable, I ask the shell to watch for 
incoming mail to me. Every 5 minutes the shell looks for this file and tells me if more mail 
has arrived there. An alternative to this is to put the command 


biff y 


in place of this set; this will cause me to be notified immediately when mail arrives, and to be 
shown the first few lines of the new message. 


Next I set the shell variable ‘time’ to ‘15’ causing the shell to automatically print out 
statistics lines for commands which execute for at least 15 seconds of CPU time. The variable 
‘history’ is set to 10 indicating that I want the shell to remember the last 10 commands I type 
in its history list (described later). 


I create an alias “ts” which executes a tset(1) command setting up the modes of the 
terminal. The parameters to tset indicate the kinds of terminal which I usually use when not on 
a hardwired port. I then execute “ts” and also use the stty command to change the interrupt 
character to ^C and the line kill character to ^U. 


I then run the ‘msgs’ program, which provides me with any system messages which I 
have not seen before; the ‘-f’ option here prevents it from telling me anything if there are no 
new messages. Finally, if my mailbox file exists, then I run the ‘mail’ program to process my 
mail. 

When the ‘mail’ and ‘msgs’ programs finish, the shell will finish processing my .login file 
and begin reading commands from the terminal, prompting for each with ‘% ’. When I log off 
(by giving the logout command) the shell will print ‘logout’ and execute commands from the 
file ‘.logout’ if it exists in my home directory. After that the shell will terminate and UNIX will 
log me off the system. If the system is not going down, I will receive a new login message. In 
any case, after the ‘logout’ message the shell is committed to terminating and will take no 
further input from my terminal. 


2.2. Shell variables 


The shell maintains a set of variables. We saw above the variables history and time 
which had values ‘10’ and ‘15’. In fact, each shell variable has as value an array of zero or 
more strings. Shell variables may be assigned values by the set command. It has several 
forms, the most useful of which was given above and is 


set name=value 


Shell variables may be used to store values which are to be used in commands later 
through a substitution mechanism. The shell variables most commonly referenced are, how- 
ever, those which the shell itself refers to. By changing the values of these variables one can 
directly affect the behavior of the shell. 


One of the most important variables is the variable path. This variable contains a 
sequence of directory names where the shell searches for commands. The set command with 
no arguments shows the value of all variables currently defined (we usually say set) in the 
shell. The default value for path will be shown by set to be 


% set 
argv    () 
cwd     /usr/bill 
home    /usr/bill 
path    (. /usr/ucb /bin /usr/bin) 
prompt  % 
shell   /bin/csh 
status  0 
term    c100rv4pna 
user    bill 
% 


This output indicates that the variable path points to the current directory ‘.’ and then 
‘/usr/ucb’, ‘/bin’ and ‘/usr/bin’. Commands which you may write might be in ‘.’ (usually one of 
your directories). Commands developed at Berkeley live in ‘/usr/ucb’ while commands 
developed at Bell Laboratories live in ‘/bin’ and ‘/usr/bin’. 


A number of locally developed programs on the system live in the directory ‘/usr/local’. 
If we wish all shells which we invoke to have access to these new programs we can place 
the command 


set path=(. /usr/ucb /bin /usr/bin /usr/local) 


in our file .cshrc in our home directory. Try doing this and then logging out and back in and 
do 


set 


again to see that the value assigned to path has changed. 


One thing you should be aware of is that the shell examines each directory which you 
insert into your path and determines which commands are contained there. Except for the 
current directory ‘.’, which the shell treats specially, this means that if commands are added to 
a directory in your search path after you have started the shell, they will not necessarily be 
found by the shell. If you wish to use a command which has been added in this way, you 
should give the command 


rehash 


to the shell, which will cause it to recompute its internal table of command locations, so that 
it will find the newly added command. Since the shell has to look in the current directory ‘.’ 
on each command, placing it at the end of the path specification usually works equivalently 
and reduces overhead. 
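

The effect of rehash can be pictured with a small model. The Python sketch below is an illustration only, not the shell's actual implementation: the class name CommandTable and its methods are hypothetical, and a cached directory listing stands in for the shell's internal table of command locations. It assumes, as described above, that every path directory except ‘.’ is examined once and remembered, while ‘.’ is examined on each lookup.

```python
import os
import tempfile


class CommandTable:
    """Toy model of the shell's table of command locations (hypothetical)."""

    def __init__(self, path):
        self.path = list(path)
        self.cache = {}
        self.rehash()

    def rehash(self):
        """Recompute the cached listing of every path directory except '.'."""
        self.cache = {}
        for directory in self.path:
            if directory == ".":
                continue  # '.' is never cached; it is examined on every lookup
            try:
                self.cache[directory] = set(os.listdir(directory))
            except OSError:
                self.cache[directory] = set()  # unreadable or missing directory

    def lookup(self, name):
        """Return where a command would be found, or None if it is not found."""
        for directory in self.path:
            if directory == ".":
                if os.path.isfile(name):
                    return os.path.join(".", name)
            elif name in self.cache.get(directory, ()):
                return os.path.join(directory, name)
        return None


# Demonstration in a scratch directory standing in for /usr/local.
local = tempfile.mkdtemp()
table = CommandTable([local])
open(os.path.join(local, "newcmd"), "w").close()  # command added after startup
print(table.lookup("newcmd"))  # the table is stale, so the command is not found
table.rehash()
print(table.lookup("newcmd"))  # after rehash the command is found
```

In this model a lookup in a cached directory never touches the file system, which is the saving the table is meant to provide, and it is also why a command added later stays invisible until rehash is given.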


Other useful built-in variables are the variable home which shows your home directory, 
cwd which contains your current working directory, and the variable ignoreeof which can be set in 
your .login file to tell the shell not to exit when it receives an end-of-file from a terminal (as 
described above). The variable ‘ignoreeof’ is one of several variables which the shell does not 
care about the value of, only whether they are set or unset. Thus to set this variable you sim- 
ply do 


set ignoreeof 


and to unset it do 


unset ignoreeof 


These give the variable ‘ignoreeof’ no value, but none is desired or required. 


Finally, some other built-in shell variables of use are the variables noclobber and mail. 
The metasyntax 


> filename 


which redirects the standard output of a command will overwrite and destroy the previous 
contents of the named file. In this way you may accidentally overwrite a file which is valu- 
able. If you would prefer that the shell not overwrite files in this way you can 


set noclobber 


in your .login file. Then trying to do 


date > now 


would cause a diagnostic if ‘now’ existed already. You could type 


date >! now 


if you really wanted to overwrite the contents of ‘now’. The ‘>!’ is a special metasyntax indi- 
cating that clobbering the file is ok. 
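

The rule that noclobber enforces can be stated as a short model. The following Python sketch is not how the shell is written; the function name redirect_output and its parameters are hypothetical, and the wording of the error is only an assumption about the diagnostic. It shows the one decision involved: an existing file is protected unless the ‘>!’ form is used.

```python
import os
import tempfile


def redirect_output(filename, text, noclobber=False, force=False):
    """Model '> filename' (force=False) and '>! filename' (force=True)."""
    if noclobber and not force and os.path.exists(filename):
        # With noclobber set, plain '>' refuses to destroy an existing file.
        raise FileExistsError(filename + ": File exists.")
    with open(filename, "w") as output:
        output.write(text)


# Demonstration in a scratch directory.
now = os.path.join(tempfile.mkdtemp(), "now")
redirect_output(now, "first\n", noclobber=True)       # file is created
try:
    redirect_output(now, "second\n", noclobber=True)  # refused
except FileExistsError as problem:
    print(problem)
redirect_output(now, "third\n", noclobber=True, force=True)  # like '>!'
```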


2.3. The shell’s history list 


The shell can maintain a history list into which it places the words of previous com- 
mands. It is possible to use a notation to reuse commands or words from commands in form- 
ing new commands. This mechanism can be used to repeat previous commands or to correct 
minor typing mistakes in commands. 


The following figure gives a sample session involving typical usage of the history 
mechanism of the shell. In this example we have a very simple C program which has a bug 
(or two) in it in the file ‘bug.c’, which we ‘cat’ out on our terminal. We then try to run the C 
compiler on it, referring to the file again as ‘!$’, meaning the last argument to the previous 
command. Here the ‘!’ is the history mechanism invocation metacharacter, and the ‘$’ stands 
for the last argument, by analogy to ‘$’ in the editor which stands for the end of the line. The 
shell echoed the command, as it would have been typed without use of the history mechanism, 
and then executed it. The compilation yielded error diagnostics so we now run the editor on 
the file we were trying to compile, fix the bug, and run the C compiler again, this time refer- 
ring to this command simply as ‘!c’, which repeats the last command which started with the 
letter ‘c’. If there were other commands starting with ‘c’ done recently we could have said ‘!cc’ 
or even ‘!cc:p’ which would have printed the last command starting with ‘cc’ without executing 
it. 


*The space between the ‘!’ and the word ‘now’ is critical here, as ‘!now’ would be an invocation of the 
history mechanism, and have a totally different effect. 


% cat bug.c 
main() 

{ 
        printf("hello); 
} 
% cc !$ 
cc bug.c 
"bug.c", line 4: newline in string or char constant 
"bug.c", line 5: syntax error 
% ed !$ 
ed bug.c 
29 
4s/);/"&/p 
        printf("hello"); 
w 
30 
q 
% !c 
cc bug.c 
% a.out 
hello% !e 
ed bug.c 
30 
4s/lo/lo\\n/p 
        printf("hello\n"); 
w 
32 
q 
% !c -o bug 
cc bug.c -o bug 
% size a.out bug 
a.out: 2784+364+1028 = 4176b = 0x1050b 
bug: 2784+364+1028 = 4176b = 0x1050b 
% ls -l !* 
ls -l a.out bug 
-rwxr-xr-x 1 bill 3932 Dec 19 09:41 a.out 
-rwxr-xr-x 1 bill 3932 Dec 19 09:42 bug 
% bug 
hello 
% num bug.c | spp 
spp: Command not found. 
% ^spp^ssp 
num bug.c | ssp 
    1 main() 
    3 { 
    4         printf("hello\n"); 
    5 } 
% !! | lpr 
num bug.c | ssp | lpr 
% 


After this recompilation, we ran the resulting ‘a.out’ file, and then noting that there still 
was a bug, ran the editor again. After fixing the program we ran the C compiler again, but 
tacked onto the command an extra ‘—o bug’ telling the compiler to place the resultant binary 
in the file ‘bug’ rather than ‘a.out’. In general, the history mechanisms may be used anywhere 
in the formation of new commands and other characters may be placed before and after the 
substituted commands. 


We then ran the ‘size’ command to see how large the binary program images we have 
created were, and then an ‘ls -l’ command with the same argument list, denoting the 
argument list ‘!*’. Finally we ran the program ‘bug’ to see that its output is indeed correct. 


To make a numbered listing of the program we ran the ‘num’ command on the file 
‘bug.c’. In order to compress out blank lines in the output of ‘num’ we ran the output through 
the filter ‘ssp’, but misspelled it as ‘spp’. To correct this we used a shell substitute, placing the 
old text and new text between ‘^’ characters. This is similar to the substitute command in the 
editor. Finally, we repeated the same command with ‘!!’, but sent its output to the line 
printer. 


There are other mechanisms available for repeating commands. The history command 
prints out a number of previous commands with numbers by which they can be referenced. 
There is a way to refer to a previous command by searching for a string which appeared in it, 
and there are other, less useful, ways to select arguments to include in a new command. A 
complete description of all these mechanisms is given in the C shell manual pages in the UNIX 
Programmer's Manual. 
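

The substitutions used in the sample session can be summarized in a few lines of code. The Python sketch below is a simplified model and not the shell's parser: the names expand_history and run_session are hypothetical, words are split on blanks only, and only the forms ‘!!’, ‘!$’, ‘!*’, ‘!’ followed by a command prefix, and the ‘^old^new’ substitute are handled.

```python
import re


def expand_history(line, history):
    """Expand a few csh-style history references in line.

    history is a list of earlier command lines, oldest first.
    """
    if line.startswith("^"):
        # ^old^new repeats the previous command with one substitution.
        parts = line.split("^")
        if len(parts) < 3 or not history or parts[1] not in history[-1]:
            raise ValueError("substitution failed")
        return history[-1].replace(parts[1], parts[2], 1)

    def replace(match):
        if not history:
            raise ValueError("no previous command")
        reference = match.group(1)
        words = history[-1].split()
        if reference == "!":
            return history[-1]          # !! is the whole previous command
        if reference == "$":
            return words[-1]            # !$ is its last argument
        if reference == "*":
            return " ".join(words[1:])  # !* is all of its arguments
        for command in reversed(history):
            if command.startswith(reference):
                return command          # !c is the latest command starting with c
        raise ValueError(reference + ": event not found")

    return re.sub(r"!(!|\$|\*|\w+)", replace, line)


def run_session(typed_lines):
    """Expand each typed line in turn, recording the result in the history."""
    history = []
    for line in typed_lines:
        history.append(expand_history(line, history))
    return history


for command in run_session(["cat bug.c", "cc !$", "ed !$", "!c", "!c -o bug"]):
    print(command)
```

Note that in this model, as in the footnote above, a ‘!’ followed by a blank is left alone, so ‘date >! now’ passes through unchanged.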


2.4. Aliases 


The shell has an alias mechanism which can be used to make transformations on input 
commands. This mechanism can be used to simplify the commands you type, to supply 
default arguments to commands, or to perform transformations on commands and their argu- 
ments. The alias facility is similar to a macro facility. Some of the features obtained by alias- 
ing can be obtained also using shell command files, but these take place in another instance of 
the shell and cannot directly affect the current shell's environment or involve commands such 
as cd which must be done in the current shell. 


As an example, suppose that there is a new version of the mail program on the system 
called ‘newmail’ which you wish to use, rather than the standard mail program which is called ‘mail’. 
If you place the shell command 


alias mail newmail 


in your .cshrc file, the shell will transform an input line of the form 


mail bill 


into a call on ‘newmail’. More generally, suppose we wish the command ‘ls’ to always show 
sizes of files, that is to always do ‘-s’. We can do 


alias ls ls -s 


or even 


alias dir ls -s 


creating a new command syntax ‘dir’ which does an ‘ls -s’. If we say 


dir ~bill 


then the shell will translate this to 


ls -s /mnt/bill 


Thus the alias mechanism can be used to provide short names for commands, to provide 
default arguments, and to define new short commands in terms of other commands. It is also 
possible to define aliases which contain multiple commands or pipelines, showing where the 
arguments to the original command are to be substituted using the facilities of the history 
mechanism. Thus the definition 


alias cd 'cd \!* ; ls' 


would do an ls command after each change directory cd command. We enclosed the entire 
alias definition in ‘'’ characters to prevent most substitutions from occurring and the character 
‘;’ from being recognized as a metacharacter. The ‘!’ here is escaped with a ‘\’ to prevent it 
from being interpreted when the alias command is typed in. The ‘\!*’ here substitutes the 
entire argument list to the pre-aliasing cd command, without giving an error if there were no 
arguments. The ‘;’ separating commands is used here to indicate that one command is to be 
done and then the next. Similarly the definition 


alias whois 'grep \!^ /etc/passwd' 


defines a command which looks up its first argument in the password file. 
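

The way arguments find their place in an alias can be modeled briefly. The Python sketch below is an approximation under stated assumptions, not the shell's own code: the function name expand_alias is hypothetical, the table holds the alias text as stored (after the quotes and the backslash before ‘!’ have been removed), and only ‘!*’ and ‘!^’ are recognized. When the alias text contains no ‘!’, the arguments simply follow it.

```python
def expand_alias(line, aliases):
    """Apply one level of alias expansion to a command line.

    aliases maps a command name to the stored alias text, that is, the text
    after the shell has removed the quotes and the backslash before '!'.
    """
    words = line.split()
    if not words or words[0] not in aliases:
        return line
    text = aliases[words[0]]
    arguments = words[1:]
    if "!" not in text:
        # No history reference: the arguments are left in place after the text.
        return " ".join([text] + arguments)
    if "!^" in text:
        if not arguments:
            raise ValueError("alias needs a first argument")
        text = text.replace("!^", arguments[0])
    # '!*' stands for the whole argument list, which may be empty.
    return text.replace("!*", " ".join(arguments))


aliases = {
    "mail": "newmail",
    "dir": "ls -s",
    "cd": "cd !* ; ls",
    "whois": "grep !^ /etc/passwd",
}
for typed in ["mail bill", "dir ~bill", "cd newpaper", "whois bill"]:
    print(expand_alias(typed, aliases))
```

The model stops after one expansion and does not expand ‘~bill’; in the shell that expansion of the home directory name is a separate, later step.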


Warning: The shell currently reads the .cshrc file each time it starts up. If you place a 
large number of commands there, shells will tend to start slowly. A mechanism for saving the 
shell environment after reading the .cshrc file and quickly restoring it is under development, 
but for now you should try to limit the number of aliases you have to a reasonable number... 
10 or 15 is reasonable, 50 or 60 will cause a noticeable delay in starting up shells, and make 
the system seem sluggish when you execute commands from within the editor and other pro- 
grams. 


2.5. More redirection; >> and >& 


There are a few more notations useful to the terminal user which have not been intro- 
duced yet. 


In addition to the standard output, commands also have a diagnostic output which is 
normally directed to the terminal even when the standard output is redirected to a file or a 
pipe. It is occasionally desirable to direct the diagnostic output along with the standard out- 
put. For instance if you want to redirect the output of a long running command into a file 
and wish to have a record of any error diagnostic it produces you can do 


command >& file 


The ‘>&’ here tells the shell to route both the diagnostic output and the standard output into 
‘file’. Similarly you can give the command 


command |& lpr 


to route both standard and diagnostic output through the pipe to the line printer daemon 
lpr.# 


Finally, it is possible to use the form 


command >> file 


to place output at the end of an existing file.† 


#A command form 


command >&! file 


exists, and is used when noclobber is set and file already exists. 


†If noclobber is set, then an error will result if file does not exist, otherwise the shell will create file if it 
doesn’t exist. A form 


command >>! file 


makes it not be an error for file to not exist when noclobber is set. 
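

The difference between ‘>’ and ‘>&’ is easy to observe from a program that starts a child command. The Python sketch below is an illustration using the standard subprocess module, not anything the shell itself does: the function name run_redirected and the one-line child program are hypothetical. With merging requested, the child's diagnostic output is sent to the same file as its standard output; without it, the diagnostic output is captured separately.

```python
import os
import subprocess
import sys
import tempfile

# A child command that writes one line to each of its two outputs.
CHILD = (
    "import sys; print('to standard output'); "
    "print('to diagnostic output', file=sys.stderr)"
)


def run_redirected(filename, merge_diagnostics):
    """Run the child with '> filename' or, if merging, '>& filename'."""
    with open(filename, "w") as output:
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            stdout=output,
            stderr=subprocess.STDOUT if merge_diagnostics else subprocess.PIPE,
        )
    with open(filename) as saved:
        return saved.read(), result.stderr


scratch = tempfile.mkdtemp()
text, diagnostics = run_redirected(os.path.join(scratch, "file"), True)
print(text)          # both lines are in the file
print(diagnostics)   # nothing was left over for the terminal
```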




2.6. Jobs; Background, Foreground, or Suspended 


When one or more commands are typed together as a pipeline or as a sequence of com- 
mands separated by semicolons, a single job is created by the shell consisting of these com- 
mands together as a unit. Single commands without pipes or semicolons create the simplest 
jobs. Usually, every line typed to the shell creates a job. Some lines that create jobs (one per 
line) are 


sort < data 
ls -s | sort -n | head -5 
mail harold 


If the metacharacter ‘&’ is typed at the end of the commands, then the job is started as 
a background job. This means that the shell does not wait for it to complete but immediately 
prompts and is ready for another command. The job runs in the background at the same 
time that normal jobs, called foreground jobs, continue to be read and executed by the shell 
one at a time. Thus 


du > usage & 


would run the du program, which reports on the disk usage of your working directory (as well 
as any directories below it), put the output into the file ‘usage’ and return immediately with a 
prompt for the next command without waiting for du to finish. The du program would 
continue executing in the background until it finished, even though you can type and execute 
more commands in the meantime. When a background job terminates, a message is typed by 
the shell just before the next prompt telling you that the job has completed. In the following 
example the du job finishes sometime during the execution of the mail command and its com- 
pletion is reported just before the prompt after the mail job is finished. 


% du > usage & 
[1] 503 
% mail bill 
How do you know when a background job is finished? 
EOT 
[1] - Done du > usage 
% 


If the job did not terminate normally the ‘Done’ message might say something else like 
‘Killed’. If you want the terminations of background jobs to be reported at the time they 
occur (possibly interrupting the output of other foreground jobs), you can set the notify vari- 
able. In the previous example this would mean that the ‘Done’ message might have come 
right in the middle of the message to Bill. Background jobs are unaffected by any signals 
from the keyboard like the STOP, INTERRUPT, or QUIT signals mentioned earlier. 


Jobs are recorded in a table inside the shell until they terminate. In this table, the shell 
remembers the command names, arguments and the process numbers of all commands in the 
job as well as the working directory where the job was started. Each job in the table is either 
running in the foreground with the shell waiting for it to terminate, running in the back- 
ground, or suspended. Only one job can be running in the foreground at one time, but several 
jobs can be suspended or running in the background at once. As each job is started, it is 
assigned a small identifying number called the job number which can be used later to refer to 
the job in the commands described below. Job numbers remain the same until the job ter- 
minates and then are re-used. 


When a job is started in the background using ‘&’, its number, as well as the process 
numbers of all its (top level) commands, is typed by the shell before prompting you for 
another command. For example, 


% ls -s | sort -n > usage & 
[2] 2034 2035 
% 


runs the ‘ls’ program with the ‘-s’ option, pipes this output into the ‘sort’ program with the 
‘-n’ option which puts its output into the file ‘usage’. Since the ‘&’ was at the end of the line, 
these two programs were started together as a background job. After starting the job, the 
shell prints the job number in brackets (2 in this case) followed by the process number of each 
program started in the job. Then the shell immediately prompts for a new command, leaving 
the job running simultaneously. 


As mentioned in section 1.8, foreground jobs become suspended by typing ^Z which 
sends a STOP signal to the currently running foreground job. A background job can become 
suspended by using the stop command described below. When jobs are suspended they 
merely stop any further progress until started again, either in the foreground or the 
background. The shell notices when a job becomes stopped and reports this fact, much like it 
reports the termination of background jobs. For foreground jobs this looks like 


% du > usage 
^Z 
Stopped 
% 


The ‘Stopped’ message is typed by the shell when it notices that the du program stopped. For 
background jobs, using the stop command, it is 


% sort usage & 
[1] 2345 
% stop %1 
[1] + Stopped (signal) sort usage 
% 


Suspending foreground jobs can be very useful when you need to temporarily change what you 
are doing (execute other commands) and then return to the suspended job. Also, foreground 
jobs can be suspended and then continued as background jobs using the bg command, allow- 
ing you to continue other work and stop waiting for the foreground job to finish. Thus 


% du > usage 
^Z 
Stopped 
% bg 
[1] du > usage & 
% 


starts ‘du’ in the foreground, stops it before it finishes, then continues it in the background 
allowing more foreground commands to be executed. This is especially helpful when a fore- 
ground job ends up taking longer than you expected and you wish you had started it in the 
background in the beginning. 


All job control commands can take an argument that identifies a particular job. All job 
name arguments begin with the character ‘%’, since some of the job control commands also 
accept process numbers (printed by the ps command). The default job (when no argument is 
given) is called the current job and is identified by a ‘+’ in the output of the jobs command, 
which shows you which jobs you have. When only one job is stopped or running in the 
background (the usual case) it is always the current job; thus no argument is needed. If a job is 
stopped while running in the foreground it becomes the current job and the existing current 
job becomes the previous job, identified by a ‘-’ in the output of jobs. When the current 
job terminates, the previous job becomes the current job. When given, the argument is either 
‘%-’ (indicating the previous job); ‘%#’, where # is the job number; ‘%pref’ where pref is 
some unique prefix of the command name and arguments of one of the jobs; or ‘%?’ followed 
by some string found in only one of the jobs. 
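

The rules for naming a job can be collected into one small function. The Python sketch below is a model of the description above rather than the shell's own code: the name resolve_job and its parameters are hypothetical, the job table is an ordinary dictionary from job number to command text, and only the argument forms named in the text are handled.

```python
def resolve_job(argument, jobs, current, previous=None):
    """Return the job number selected by a job control argument.

    jobs maps job numbers to command text; an empty argument selects the
    current job, as when a command such as fg is given no argument.
    """
    if argument == "":
        return current
    if not argument.startswith("%"):
        raise ValueError("job names begin with '%'")
    name = argument[1:]
    if name == "":
        raise ValueError("form not covered by this sketch")
    if name == "-":
        if previous is None:
            raise ValueError("no previous job")
        return previous
    if name.isdigit():
        number = int(name)
        if number not in jobs:
            raise ValueError("no such job")
        return number
    if name.startswith("?"):
        # %?string: the string may appear anywhere in the command text.
        matches = [n for n, text in jobs.items() if name[1:] in text]
    else:
        # %pref: the text of the job must begin with the prefix.
        matches = [n for n, text in jobs.items() if text.startswith(name)]
    if len(matches) != 1:
        raise ValueError("no such job, or more than one job matches")
    return matches[0]


jobs = {1: "du > usage", 2: "ls -s | sort -n > myfile", 3: "mail bill"}
print(resolve_job("%ls", jobs, current=3, previous=1))
print(resolve_job("%?sort", jobs, current=3, previous=1))
```

The requirement that exactly one job match is what makes a prefix or a ‘%?’ string usable only when it is unique among the jobs in the table.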


The jobs command types the table of jobs, giving the job number, commands and status 
(‘Stopped’ or ‘Running’) of each background or suspended job. With the ‘-l’ option the 
process numbers are also typed. 


% du > usage & 
[1] 3398 
% ls -s | sort -n > myfile & 
[2] 3405 
% mail bill 
^Z 
Stopped 
% jobs 
[1]    Running    du > usage 
[2]    Running    ls -s | sort -n > myfile 
[3]  + Stopped    mail bill 
% fg %ls 
ls -s | sort -n > myfile 
% more myfile 


The fg command runs a suspended or background job in the foreground. It is used to 
restart a previously suspended job or change a background job to run in the foreground 
(allowing signals or input from the terminal). In the above example we used fg to change the 
‘ls’ job from the background to the foreground since we wanted to wait for it to finish before 
looking at its output file. The bg command runs a suspended job in the background. It is 
usually used after stopping the currently running foreground job with the STOP signal. The 
combination of the STOP signal and the bg command changes a foreground job into a back- 
ground job. The stop command suspends a background job. 


The kill command terminates a background or suspended job immediately. In addition 
to jobs, it may be given process numbers as arguments, as printed by ps. Thus, in the example 
above, the running du command could have been terminated by the command 


% kill %1 
[1] Terminated du > usage 
% 


The notify command (not the variable mentioned earlier) indicates that the termination 
of a specific job should be reported at the time it finishes instead of waiting for the next 
prompt. 


If a job running in the background tries to read input from the terminal it is automati- 
cally stopped. When such a job is then run in the foreground, input can be given to the job. 
If desired, the job can be run in the background again until it requests input again. This is 
illustrated in the following sequence where the ‘s’ command in the text editor might take a 
long time. 


% ed bigfile 
120000 
1,$s/thisword/thatword/ 
^Z 
Stopped 
% bg 
[1] ed bigfile & 
% 
... some foreground commands 
[1] Stopped (tty input) ed bigfile 
% fg 
ed bigfile 
w 
120000 
q 
% 


So after the ‘s’ command was issued, the ‘ed’ job was stopped with ^Z and then put in the 
background using bg. Some time later when the ‘s’ command was finished, ed tried to read 
another command and was stopped because jobs in the background cannot read from the 
terminal. The fg command returned the ‘ed’ job to the foreground where it could once again 
accept commands from the terminal. 


The command 


stty tostop 


causes all background jobs run on your terminal to stop when they are about to write output 
to the terminal. This prevents messages from background jobs from interrupting foreground 
job output and allows you to run a job in the background without losing terminal output. It 
also can be used for interactive programs that sometimes have long periods without interaction. 
Such a program will stop each time it is about to output a prompt for more input. It 
can then be run in the foreground using fg, more input can be given and, if necessary, it can be stopped 
.login file if you do not like output from background jobs interrupting your work. It also can 
reduce the need for redirecting the output of background jobs if the output is not very big: 


% stty tostop 
% wc hugefile & 
[1] 10387 
% ed text 
... some time later 
q 
[1] Stopped (tty output) wc hugefile 
% fg wc 
wc hugefile 
13371 30123 302577 
% stty -tostop 


Thus after some time the ‘wc’ command, which counts the lines, words and characters in a 
file, had one line of output. When it tried to write this to the terminal it stopped. By restart- 
ing it in the foreground we allowed it to write on the terminal exactly when we were ready to 
look at its output. Programs which attempt to change the mode of the terminal will also 
block, whether or not tostop is set, when they are not in the foreground, as it would be very 
unpleasant to have a background job change the state of the terminal. 


Since the jobs command only prints jobs started in the currently executing shell, it 
knows nothing about background jobs started in other login sessions or within shell files. The 
ps command can be used in this case to find out about background jobs not started in the current shell. 


2.7. Working Directories 


As mentioned in section 1.6, the shell is always in a particular working directory. The 
‘change directory’ command chdir (its short form cd may also be used) changes the working 
directory of the shell, that is, changes the directory you are located in. 


It is useful to make a directory for each project you wish to work on and to place all files 
related to that project in that directory. The ‘make directory’ command, mkdir, creates a new 
directory. The pwd (‘print working directory’) command reports the absolute pathname of 
the working directory of the shell, that is, the directory you are located in. Thus in the 
example below: 


% pwd 

/usr/bill 

% mkdir newpaper 
% chdir newpaper 
% pwd 
/usr/bill/newpaper 
% 


the user has created and moved to the directory newpaper, where, for example, he might 
place a group of related files. 


No matter where you have moved to in a directory hierarchy, you can return to your 
‘home’ login directory by doing just 


cd 


with no arguments. The name ‘..’ always means the directory above the current one in the 
hierarchy, thus 


cd .. 


changes the shell’s working directory to the one directly above the current one. The name ‘..’ 
can be used in any pathname, thus, 


cd ../programs 


means change to the directory ‘programs’ contained in the directory above the current one. If 
you have several directories for different projects under, say, your home directory, this short- 
hand notation permits you to switch easily between them. 


The shell always remembers the pathname of its current working directory in the vari- 
able cwd. The shell can also be requested to remember the previous directory when you 
change to a new working directory. If the ‘push directory’ command pushd is used in place of 
the cd command, the shell saves the name of the current working directory on a directory 
stack before changing to the new one. You can see this list at any time by typing the ‘direc- 
tories’ command dirs. 


% pushd newpaper/references 
~/newpaper/references ~ 
% pushd /usr/lib/tmac 
/usr/lib/tmac ~/newpaper/references ~ 
% dirs 
/usr/lib/tmac ~/newpaper/references ~ 
% popd 
~/newpaper/references ~ 
% popd 
~ 
% 


The list is printed in a horizontal line, reading left to right, with a tilde (~) as shorthand for 
your home directory—in this case ‘/usr/bill’. The directory stack is printed whenever there is 
more than one entry on it and it changes. It is also printed by a dirs command. Dirs is usu- 
ally faster and more informative than pwd since it shows the current working directory as well 
as any other directories remembered in the stack. 


The pushd command with no argument alternates the current directory with the first 
directory in the list. The ‘pop directory’ popd command without an argument returns you to 
the directory you were in prior to the current one, discarding the previous current directory 
from the stack (forgetting it). Typing popd several times in a series takes you backward 
through the directories you had been in (changed to) by pushd commands. There are other 
options to pushd and popd to manipulate the contents of the directory stack and to change to 
directories not at the top of the stack; see the csh manual page for details. 
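

The stack behavior described above can be modeled in a few lines. The following Python sketch is an illustration of the semantics only: the class name DirStack and its methods are hypothetical, it neither changes directories nor resolves relative names, and it is not how csh itself is implemented.

```python
class DirStack:
    """Toy model of the csh directory stack; entry 0 is the current directory."""

    def __init__(self, home):
        self.stack = [home]

    def pushd(self, path=None):
        if path is None:
            # No argument: exchange the current directory with the next entry.
            if len(self.stack) < 2:
                raise IndexError("pushd: no other directory")
            self.stack[0], self.stack[1] = self.stack[1], self.stack[0]
        else:
            # Remember the current directory beneath the new one.
            self.stack.insert(0, path)
        return self.dirs()

    def popd(self):
        if len(self.stack) < 2:
            raise IndexError("popd: directory stack empty")
        # Forget the current directory and return to the previous one.
        self.stack.pop(0)
        return self.dirs()

    def dirs(self):
        # Printed on one line, reading left to right, current directory first.
        return " ".join(self.stack)


ds = DirStack("~")
print(ds.pushd("~/newpaper/references"))
print(ds.pushd("/usr/lib/tmac"))
print(ds.popd())
print(ds.popd())
```

The four lines printed follow the same pattern as the pushd and popd session shown earlier: each push adds an entry on the left, and each pop discards the leftmost entry.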


Since the shell remembers the working directory in which each job was started, it warns 
you when you might be confused by restarting a job in the foreground which has a different 
working directory than the current working directory of the shell. Thus if you start a back- 
ground job, then change the shell’s working directory and then cause the background job to 
run in the foreground, the shell warns you that the working directory of the currently running 
foreground job is different from that of the shell. 


% dirs -l
/mnt/bill
% cd myproject
% dirs
~/myproject
% ed prog.c
1143
^Z
Stopped
% cd ..
% ls
myproject
textfile
% fg
ed prog.c (wd: ~/myproject)


This way the shell warns you when there is an implied change of working directory, even 
though no cd command was issued. In the above example the ‘ed’ job was still in
‘/mnt/bill/myproject’ even though the shell had changed to ‘/mnt/bill’. A similar warning is given
when such a foreground job terminates or is suspended (using the STOP signal) since the 
return to the shell again implies a change of working directory. 


% fg
ed prog.c (wd: ~/myproject)
... after some editing
q
(wd now: ~)
%


These messages are sometimes confusing if you use programs that change their own working 
directories, since the shell only remembers which directory a job is started in, and assumes it 
stays there. The ‘-l’ option of jobs will type the working directory of suspended or
background jobs when it is different from the current working directory of the shell.


2.8. Useful built-in commands 


We now give a few of the useful built-in commands of the shell describing how they are 
used. 


The alias command described above is used to assign new aliases and to show the exist- 
ing aliases. With no arguments it prints the current aliases. It may also be given only one 
argument such as 


alias ls 


to show the current alias for, e.g., ‘ls’.


The echo command prints its arguments. It is often used in shell scripts or as an
interactive command to see what filename expansions will produce.


The history command will show the contents of the history list. The numbers given
with the history events can be used to reference previous events which are difficult to
reference using the contextual mechanisms introduced above. There is also a shell variable called
prompt. By placing a ‘!’ character in its value the shell will there substitute the number of the
current command in the history list. You can use this number to refer to this command in a
history substitution. Thus you could


set prompt='\! % '


Note that the ‘!’ character had to be escaped here even within ‘'’ characters.


The limit command is used to restrict use of resources. With no arguments it prints the 
current limitations: 


cputime         unlimited
filesize        unlimited
datasize        5616 kbytes
stacksize       512 kbytes
coredumpsize    unlimited


Limits can be set, e.g.:


limit coredumpsize 128k


Most reasonable units abbreviations will work; see the csh manual page for more details.


The logout command can be used to terminate a login shell which has ignoreeof set.


The rehash command causes the shell to recompute a table of where commands are 
located. This is necessary if you add a command to a directory in the current shell’s search 
path and wish the shell to find it, since otherwise the hashing algorithm may tell the shell that 
the command wasn’t in that directory when the hash table was computed. 


The repeat command can be used to repeat a command several times. Thus to make 5 
copies of the file one in the file five you could do 


repeat 5 cat one >> five 


The setenv command can be used to set variables in the environment. Thus


setenv TERM adm3a


will set the value of the environment variable TERM to ‘adm3a’. A user program printenv
exists which will print out the environment. It might then show:


% printenv
HOME=/usr/bill
SHELL=/bin/csh
PATH=:/usr/ucb:/bin:/usr/bin:/usr/local
TERM=adm3a
USER=bill
%


The source command can be used to force the current shell to read commands from a
file. Thus


source .cshrc


can be used after editing in a change to the .cshrc file which you wish to take effect before the
next time you login.


The time command can be used to cause a command to be timed no matter how much
CPU time it takes. Thus


% time cp /etc/rc /usr/bill/rc
0.0u 0.1s 0:01 8% 2+1k 3+2io 1pf+0w
% time wc /etc/rc /usr/bill/rc
     52    178   1347 /etc/rc
     52    178   1347 /usr/bill/rc
    104    356   2694 total
0.1u 0.1s 0:00 13% 3+3k 5+3io 7pf+0w
%


indicates that the cp command used a negligible amount of user time (u) and about 1/10th of
a second of system time (s); the elapsed time was 1 second (0:01); there was an average memory
usage of 2k bytes of program space and 1k bytes of data space over the CPU time involved (2+1k); the
program did three disk reads and two disk writes (3+2io), and took one page fault and was
not swapped (1pf+0w). The word count command wc, on the other hand, used 0.1 seconds of
user time and 0.1 seconds of system time in less than a second of elapsed time. The
percentage ‘13%’ indicates that over the period when it was active the command ‘wc’ used an
average of 13 percent of the available CPU cycles of the machine.
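

As a reading aid, the fields of the summary line can be picked apart mechanically. The Python sketch below assumes the line has exactly the layout shown in the example above; the function name parse_time_line and the field names are hypothetical, and other versions of the shell may print the line differently.

```python
import re

# Layout assumed from the example: user, system, elapsed, CPU percentage,
# program+data space in kbytes, disk reads+writes, page faults+swaps.
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)u (?P<system>[\d.]+)s (?P<elapsed>[\d:]+) (?P<cpu>\d+)% "
    r"(?P<text_k>\d+)\+(?P<data_k>\d+)k "
    r"(?P<reads>\d+)\+(?P<writes>\d+)io "
    r"(?P<faults>\d+)pf\+(?P<swaps>\d+)w"
)


def parse_time_line(line):
    """Split one time summary line into named fields."""
    match = TIME_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError("unrecognized time summary: %r" % line)
    fields = match.groupdict()
    result = {"elapsed": fields.pop("elapsed")}      # kept as text, e.g. '0:01'
    result["user"] = float(fields.pop("user"))        # seconds of user time
    result["system"] = float(fields.pop("system"))    # seconds of system time
    result.update((name, int(value)) for name, value in fields.items())
    return result


summary = parse_time_line("0.0u 0.1s 0:01 8% 2+1k 3+2io 1pf+0w")
print(summary["reads"], summary["writes"], summary["faults"])
```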


The unalias and unset commands can be used to remove aliases and variable definitions 
from the shell, and unsetenv removes variables from the environment. 


2.9. What else? 


This concludes the basic discussion of the shell for terminal users. There are more 
features of the shell to be discussed here, and all features of the shell are discussed in its 
manual pages. One useful feature which is discussed later is the foreach built-in command 
which can be used to run the same command sequence with a number of different arguments. 


If you intend to use UNIX a lot you should look through the rest of this document
and the shell manual pages to become familiar with the other facilities which are available to 
you. 


Introduction to the C Shell 4-53 


3. Shell control structures and command scripts 


3.1. Introduction 


It is possible to place commands in files and to cause shells to be invoked to read and 
execute commands from these files, which are called shell scripts. We here detail those 
features of the shell useful to the writers of such scripts. 


3.2. Make 


It is important to first note what shell scripts are not useful for. There is a program 
called make which is very useful for maintaining a group of related files or performing sets of 
operations on related files. For instance a large program consisting of one or more files can 
have its dependencies described in a makefile which contains definitions of the commands 
used to create these different files when changes occur. Definitions of the means for printing 
listings, cleaning up the directory in which the files reside, and installing the resultant
programs are easily, and most appropriately, placed in this makefile. This format is superior and
preferable to maintaining a group of shell procedures to maintain these files.


Similarly when working on a document a makefile may be created which defines how 
different versions of the document are to be created and which options of nroff or troff are 
appropriate. 


3.3. Invocation and the argv variable 


A csh command script may be interpreted by saying


% csh script ...


where script is the name of the file containing a group of csh commands and ‘...’ is replaced
by a sequence of arguments. The shell places these arguments in the variable argv and then
begins to read commands from the script. These parameters are then available through the
same mechanisms which are used to reference any other shell variables.


If you make the file ‘script’ executable by doing


chmod 755 script


and place a shell comment at the beginning of the shell script (i.e. begin the file with a ‘#’
character) then a ‘/bin/csh’ will automatically be invoked to execute ‘script’ when you type


script


If the file does not begin with a ‘#’ then the standard shell ‘/bin/sh’ will be used to execute it.
This allows you to convert your older shell scripts to use csh at your convenience.


3.4. Variable substitution 


After each input line is broken into words and history substitutions are done on it, the
input line is parsed into distinct commands. Before each command is executed a mechanism
known as variable substitution is done on these words. Keyed by the character ‘$’ this
substitution replaces the names of variables by their values. Thus


echo $argv


when placed in a command script would cause the current value of the variable argv to be
echoed to the output of the shell script. It is an error for argv to be unset at this point.


A number of notations are provided for accessing components and attributes of variables. 
The notation 


$?name 


expands to ‘1’ if name is set or to ‘0’ if name is not set. It is the fundamental mechanism used
for checking whether particular variables have been assigned values. All other forms of
reference to undefined variables cause errors.


The notation


$#name


expands to the number of elements in the variable name. Thus


% set argv=(a b c) 
% echo $?argv 
1 
% echo $#argv 
3 
% unset argv 
% echo $?argv 
0 
% echo $argv 
Undefined variable: argv. 
% 


It is also possible to access the components of a variable which has several values. Thus


$argv[1]


gives the first component of argv or in the example above ‘a’. Similarly


$argv[$#argv]


would give ‘c’, and


$argv[1-2]


would give ‘a b’. Other notations useful in shell scripts are


$n


where n is an integer as a shorthand for


$argv[n]


the nth parameter and


$*


which is a shorthand for


$argv


The form


$$


expands to the process number of the current shell. Since this process number is unique in
the system it can be used in generation of unique temporary file names. The form


$<


is quite special and is replaced by the next line of input read from the shell’s standard input
(not the script it is reading). This is useful for writing shell scripts that are interactive, read- 
ing commands from the terminal, or even writing a shell script that acts as a filter, reading 
lines from its input file. Thus the sequence 


echo 'yes or no?\c'
set a=($<)


would write out the prompt ‘yes or no?’ without a newline and then read the answer into the
variable ‘a’. In this case ‘$#a’ would be ‘0’ if either a blank line or end-of-file (^D) was typed.




One minor difference between ‘$n’ and ‘$argv[n]’ should be noted here. The form
‘$argv[n]’ will yield an error if n is not in the range ‘1-$#argv’ while ‘$n’ will never yield an
out of range subscript error. This is for compatibility with the way older shells handled
parameters.


Another important point is that it is never an error to give a subrange of the form ‘n-’;
if there are fewer than n components of the given variable then no words are substituted. A
range of the form ‘m-n’ likewise returns an empty vector without giving an error when m
exceeds the number of elements of the given variable, provided the subscript n is in range.
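

The subscript rules above can be summarized as a small model. This Python sketch encodes only the rules stated in this section; the function names subscript and positional are hypothetical, and the real shell may differ in corner cases that the text does not cover.

```python
def subscript(words, spec):
    """Model $name[spec], where spec is 'n', 'm-n' or 'm-' (1-based)."""
    count = len(words)
    if "-" not in spec:
        n = int(spec)
        if not 1 <= n <= count:
            raise IndexError("Subscript out of range.")
        return [words[n - 1]]
    first, last = spec.split("-", 1)
    m = int(first)
    if m < 1:
        raise IndexError("Subscript out of range.")
    if last == "":
        # 'm-' is never an error; too few components gives no words.
        return words[m - 1:]
    n = int(last)
    if not 1 <= n <= count:
        raise IndexError("Subscript out of range.")
    # Empty when m exceeds the number of elements and n is in range.
    return words[m - 1:n]


def positional(words, n):
    """Model $n: like $argv[n], but out of range gives no words, not an error."""
    return words[n - 1:n] if n >= 1 else []


argv = ["a", "b", "c"]
print(subscript(argv, "1-2"), subscript(argv, "5-"), positional(argv, 4))
```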


3.5. Expressions 


In order for interesting shell scripts to be constructed it must be possible to evaluate
expressions in the shell based on the values of variables. In fact, all the arithmetic operations
of the language C are available in the shell with the same precedence that they have in C. In
particular, the operations ‘==’ and ‘!=’ compare strings and the operators ‘&&’ and ‘||’
implement the boolean and/or operations. The special operators ‘=~’ and ‘!~’ are similar to ‘==’ and
‘!=’ except that the string on the right side can have pattern matching characters (like *, ? or
[]) and the test is whether the string on the left matches the pattern on the right.


The shell also allows file enquiries of the form


-? filename


where ‘?’ is replaced by a number of single characters. For instance the expression primitive


-e filename


tells whether the file ‘filename’ exists. Other primitives test for read, write and execute access
to the file, whether it is a directory, or has non-zero length.


It is possible to test whether a command terminates normally, by a primitive of the form 
‘{ command }’ which returns true, i.e. ‘1’ if the command succeeds exiting normally with exit 
status 0, or ‘0’ if the command terminates abnormally or with exit status non-zero. If more 
detailed information about the execution status of a command is required, it can be executed 
and the variable ‘$status’ examined in the next command. Since ‘$status’ is set by every com- 
mand, it is very transient. It can be saved if it is inconvenient to use it only in the single 
immediately following command. 
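

The mapping from exit status to truth value is easy to get backwards, since a status of 0 means success but yields the value ‘1’. The Python sketch below illustrates the idea using the standard subprocess module; the helper name succeeds is hypothetical, and the Python interpreter itself stands in for an arbitrary command.

```python
import subprocess
import sys


def succeeds(argv):
    """Return 1 if the command exits with status 0, otherwise 0."""
    return 1 if subprocess.run(argv).returncode == 0 else 0


# Two stand-in commands: one exits with status 0, the other with status 3.
ok = succeeds([sys.executable, "-c", "raise SystemExit(0)"])
failed = succeeds([sys.executable, "-c", "raise SystemExit(3)"])

# When more detail is needed, save the raw status right away, much as
# one would copy $status into another variable before it is overwritten.
status = subprocess.run([sys.executable, "-c", "raise SystemExit(3)"]).returncode

print(ok, failed, status)
```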


For a full list of expression components available see the manual section for the shell. 


3.6. Sample shell script 


A sample shell script which makes use of the expression mechanism of the shell and
some of its control structure follows:


% cat copyc
#
# Copyc copies those C programs in the specified list
# to the directory ~/backup if they differ from the files
# already in ~/backup
#
set noglob
foreach i ($argv)

        if ($i !~ *.c) continue  # not a .c file so do nothing

        if (! -r ~/backup/$i:t) then
                echo $i:t not in backup... not cp\'ed
                continue
        endif

        cmp -s $i ~/backup/$i:t  # to set $status

        if ($status != 0) then
                echo new backup of $i
                cp $i ~/backup/$i:t
        endif
end


This script makes use of the foreach command, which causes the shell to execute the 
commands between the foreach and the matching end for each of the values given between ‘(’ 
and ‘)’ with the named variable, in this case ‘i’ set to successive values in the list. Within this 
loop we may use the command break to stop executing the loop and continue to prematurely 
terminate one iteration and begin the next. After the foreach loop the iteration variable (i in 
this case) has the value at the last iteration. 


We set the variable noglob here to prevent filename expansion of the members of argv.
This is a good idea, in general, if the arguments to a shell script are filenames which have 
already been expanded or if the arguments may contain filename expansion metacharacters. 
It is also possible to quote each use of a ‘$’ variable expansion, but this is harder and less reli- 
able. 
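

For readers more at home in a general purpose language, the control flow of copyc can be restated as follows. This Python sketch is a paraphrase of the script's logic, not a replacement for it: the function name copyc and its backup_dir parameter are choices made for the illustration, and messages are returned in a list instead of being echoed.

```python
import filecmp
import fnmatch
import os
import shutil


def copyc(paths, backup_dir):
    """Refresh backups of C files that differ from the copy in backup_dir."""
    messages = []
    for path in paths:                                # foreach i ($argv)
        name = os.path.basename(path)                 # $i:t
        if not fnmatch.fnmatchcase(path, "*.c"):      # if ($i !~ *.c) continue
            continue
        target = os.path.join(backup_dir, name)
        if not os.access(target, os.R_OK):            # if (! -r ~/backup/$i:t)
            messages.append(name + " not in backup... not cp'ed")
            continue
        # cmp -s sets $status to non-zero when the two files differ.
        if not filecmp.cmp(path, target, shallow=False):
            messages.append("new backup of " + path)
            shutil.copy(path, target)
    return messages
```

Calling copyc with a list of pathnames and the name of a backup directory returns the messages the script would have echoed, in the same order.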


The other control construct used here is a statement of the form


if ( expression ) then
        command
endif


The placement of the keywords here is not flexible due to the current implementation of the
shell.†


†The following two formats are not currently acceptable to the shell:


if ( expression )        # Won't work!
then
        command
endif


and


if ( expression ) then command endif        # Won't work


The shell does have another form of the if statement of the form


if ( expression ) command


which can be written


if ( expression ) \
        command


Here we have escaped the newline for the sake of appearance. The command must not
involve ‘|’, ‘&’ or ‘;’ and must not be another control command. The second form requires the
final ‘\’ to immediately precede the end-of-line.


The more general if statements above also admit a sequence of else-if pairs followed by
a single else and an endif, e.g.:


if ( expression ) then
        commands
else if ( expression ) then
        commands
else
        commands
endif


Another important mechanism used in shell scripts is the ‘:’ modifier. We can use the
modifier ‘:r’ here to extract a root of a filename or ‘:e’ to extract the extension. Thus if the
variable i has the value ‘/mnt/foo.bar’ then


% echo $i $i:r $i:e
/mnt/foo.bar /mnt/foo bar
%


shows how the ‘:r’ modifier strips off the trailing ‘.bar’ and the ‘:e’ modifier leaves only the
‘bar’. Other modifiers will take off the last component of a pathname leaving the head ‘:h’ or
all but the last component of a pathname leaving the tail ‘:t’. These modifiers are fully
described in the csh manual pages in the programmer’s manual. It is also possible to use the
command substitution mechanism described in the next major section to perform
modifications on strings to then reenter the shell’s environment. Since each usage of this
mechanism involves the creation of a new process, it is much more expensive to use than the
‘:’ modification mechanism.‡


Finally, we note that the character ‘#’ lexically introduces a
shell comment in shell scripts (but not from the terminal). All subsequent characters on the
input line after a ‘#’ are discarded by the shell. This character can be quoted using ‘'’ or ‘\’ to
place it in an argument word.
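

The four pathname modifiers correspond closely to familiar path operations. The Python sketch below uses the standard posixpath module to show the correspondence for the example value used earlier; the function name modifiers is hypothetical, and the two systems may disagree on unusual names such as those beginning with a ‘.’.

```python
import posixpath


def modifiers(path):
    """Approximate the csh :r, :e, :h and :t modifiers for one pathname."""
    root, ext = posixpath.splitext(path)
    return {
        ":r": root,                      # root: everything before the extension
        ":e": ext[1:],                   # extension, without the leading '.'
        ":h": posixpath.dirname(path),   # head: all but the last component
        ":t": posixpath.basename(path),  # tail: the last component
    }


for name, value in modifiers("/mnt/foo.bar").items():
    print(name, value)
```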


‡It is also important to note that the current implementation of the shell limits the number of ‘:’ modifiers
on a ‘$’ substitution to 1. Thus


% echo $i $i:h:t
/a/b/c /a/b:t
%


does not do what one would expect.


3.7. Other control structures


The shell also has control structures while and switch similar to those of C. These take
the forms


while ( expression )
        commands
end


and


switch ( word )

case str1:
        commands
        breaksw

 ...

case strn:
        commands
        breaksw

default:
        commands
        breaksw

endsw


For details see the manual section for csh. C programmers should note that we use breaksw to 
exit from a switch while break exits a while or foreach loop. A common mistake to make in 
csh scripts is to use break rather than breaksw in switches. 


Finally, csh allows a goto statement, with labels looking like they do in C, i.e.: 


loop: 
commands 
goto loop 


3.8. Supplying input to commands 

Commands run from shell scripts receive by default the standard input of the shell 
which is running the script. This is different from previous shells running under UNIX. It 
allows shell scripts to fully participate in pipelines, but mandates extra notation for com- 
mands which are to take inline data. 

Thus we need a metanotation for supplying inline data to commands in shell scripts. As
an example, consider this script which runs the editor to delete leading blanks from the lines
in each argument file.


% cat deblank
# deblank -- remove leading blanks
foreach i ($argv)
ed - $i << 'EOF'
1,$s/^[ ]*//
w
q
'EOF'
end
%


The notation ‘<< 'EOF'’ means that the standard input for the ed command is to come from
the text in the shell script file up to the next line consisting of exactly ‘'EOF'’. The fact that
the ‘EOF’ is enclosed in ‘'’ characters, i.e. quoted, causes the shell to not perform variable
substitution on the intervening lines. In general, if any part of the word following the ‘<<’
which the shell uses to terminate the text to be given to the command is quoted then these
substitutions will not be performed. In this case since we used the form ‘1,$’ in our editor
script we needed to ensure that this ‘$’ was not variable substituted. We could also have
ensured this by preceding the ‘$’ here with a ‘\’, i.e.:


1,\$s/^[ ]*//


but quoting the ‘EOF’ terminator is a more reliable way of achieving the same thing.


3.9. Catching interrupts


If our shell script creates temporary files, we may wish to catch interruptions of the shell 
script so that we can clean up these files. We can then do 


onintr label 


where label is a label in our program. If an interrupt is received the shell will do a ‘goto label’ 
and we can remove the temporary files and then do an exit command (which is built in to the 
shell) to exit from the shell script. If we wish to exit with a non-zero status we can do 


exit(1) 


e.g. to exit with status ‘1’. 
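

The same clean-up pattern appears in most languages. As a rough analogy only, the Python sketch below removes a temporary file whether or not the work is interrupted and reports a non-zero status after an interrupt; the function name run_with_cleanup and the callable it receives are hypothetical, and Python's KeyboardInterrupt stands in for the interrupt that onintr catches.

```python
import os
import tempfile


def run_with_cleanup(work):
    """Run work(tmp_path); always remove the temporary file afterwards."""
    handle, tmp_path = tempfile.mkstemp()
    os.close(handle)
    try:
        work(tmp_path)
        return 0                 # normal completion
    except KeyboardInterrupt:
        return 1                 # interrupted: report failure, like exit(1)
    finally:
        os.remove(tmp_path)      # the clean-up that the label would perform


def example_work(path):
    # Stand-in for the real job: write something to the temporary file.
    with open(path, "w") as scratch:
        scratch.write("scratch data\n")


print(run_with_cleanup(example_work))
```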


3.10. What else? 


There are other features of the shell useful to writers of shell procedures. The verbose
and echo options and the related -v and -x command line options can be used to help trace
the actions of the shell. The -n option causes the shell only to read commands and not to
execute them and may sometimes be of use.


One other thing to note is that csh will not execute shell scripts which do not begin with 
the character ‘#’, that is shell scripts that do not begin with a comment. Similarly, the 
‘/bin/sh’ on your system may well defer to ‘csh’ to interpret shell scripts which begin with ‘#’. 
This allows shell scripts for both shells to live in harmony. 


There is also another quotation mechanism using ‘"’ which allows only some of the
expansion mechanisms we have so far discussed to occur on the quoted string and serves to
make this string into a single word as ‘'’ does.


4. Other, less commonly used, shell features


4.1. Loops at the terminal; variables as vectors


It is occasionally useful to use the foreach control structure at the terminal to aid in per- 
forming a number of similar commands. For instance, there were at one point three shells in 
use on the Cory UNIX system at Cory Hall, ‘/bin/sh’, ‘/bin/nsh’, and ‘/bin/csh’. To count the 
number of persons using each shell one could have issued the commands 


% grep -c csh$ /etc/passwd
27
% grep -c nsh$ /etc/passwd
128
% grep -c -v sh$ /etc/passwd
430
%


Since these commands are very similar we can use foreach to do this more easily.


% foreach i ('sh$' 'csh$' '-v sh$')
? grep -c $i /etc/passwd
? end
27
128
430
%


Note here that the shell prompts for input with ‘? ’ when reading the body of the loop. 


Very useful with loops are variables which contain lists of filenames or other words. You 
can, for example, do 


% set a=(`ls`)
% echo $a
csh.n csh.rm
% ls
csh.n
csh.rm
% echo $#a
2
%


The set command here gave the variable a a list of all the filenames in the current directory 
as value. We can then iterate over these names to perform any chosen function. 


The output of a command within ‘`’ characters is converted by the shell to a list of
words. You can also place the ‘`’ quoted string within ‘"’ characters to take each (non-empty)
line as a component of the variable, preventing the lines from being split into words at blanks
and tabs. A modifier ‘:x’ exists which can be used later to expand each component of the
variable into another variable splitting it into separate words at embedded blanks and tabs.


4.2. Braces { ... } in argument expansion


Another form of filename expansion, alluded to before, involves the characters ‘{’ and ‘}’.
These characters specify that the contained strings, separated by ‘,’ are to be consecutively
substituted into the containing characters and the results expanded left to right. Thus


A{str1,str2,...strn}B


expands to


Astr1B Astr2B ... AstrnB


This expansion occurs before the other filename expansions, and may be applied recursively 
(i.e. nested). The results of each expanded string are sorted separately, left to right order 
being preserved. The resulting filenames are not required to exist if no other expansion 
mechanisms are used. This means that this mechanism can be used to generate arguments 
which are not filenames, but which have common parts. 


A typical use of this would be


mkdir ~/{hdrs,retrofit,csh}


to make subdirectories ‘hdrs’, ‘retrofit’ and ‘csh’ in your home directory. This mechanism is
most useful when the common prefix is longer than in this example, e.g.


chown root /usr/{ucb/{ex,edit},lib/{ex?.?*,how_ex}}
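

To see how nested groups unfold, the expansion rule can be written as a short recursive function. The Python sketch below models only the brace step described in this section; the function name expand_braces is hypothetical, the later filename expansion of characters such as ‘?’ and ‘*’ is deliberately left out, and special cases in the real shell (an empty ‘{}’, for instance) are not modeled.

```python
def expand_braces(word):
    """Expand the first {a,b,...} group in word, then recurse on each result."""
    start = word.find("{")
    if start == -1:
        return [word]
    depth = 0
    parts = []
    piece_start = start + 1
    for i in range(start, len(word)):
        ch = word[i]
        if ch == "{":
            depth += 1
        elif ch == "," and depth == 1:
            # A comma at the outer level separates two alternatives.
            parts.append(word[piece_start:i])
            piece_start = i + 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                parts.append(word[piece_start:i])
                prefix, suffix = word[:start], word[i + 1:]
                results = []
                for part in parts:
                    # Nested groups are handled by expanding each result again.
                    results.extend(expand_braces(prefix + part + suffix))
                return results
    return [word]  # unbalanced braces: leave the word unchanged


for name in expand_braces("/usr/{ucb/{ex,edit},lib/{ex?.?*,how_ex}}"):
    print(name)
```

Left to right order is preserved, so the two ‘ucb’ names are produced before the two ‘lib’ names.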


4.3. Command substitution 


A command enclosed in ‘`’ characters is replaced, just before filenames are expanded, by
the output from that command. Thus it is possible to do


set pwd=`pwd`


to save the current directory in the variable pwd or to do


ex `grep -l TRACE *.c`


to run the editor ex supplying as arguments those files whose names end in ‘.c’ which have the
string ‘TRACE’ in them.*


4.4. Other details not covered here 


In particular circumstances it may be necessary to know the exact nature and order of
different substitutions performed by the shell. The exact meaning of certain combinations of
quotations is also occasionally important. These are detailed fully in the shell’s manual section.


The shell has a number of command line option flags mostly of use in writing UNIX
programs, and debugging shell scripts. See the shell’s manual section for a list of these options.


*Command expansion also occurs in input redirected with ‘<<’ and within ‘"’ quotations. Refer to the shell
manual section for full details.


Appendix - Special characters

The following table lists the special characters of csh and the UNIX system, giving for each the
section(s) in which it is discussed. A number of these characters also have special meaning in
expressions. See the csh manual section for a complete list.


Syntactic metacharacters

;     2.4      separates commands to be executed sequentially
|     1.5      separates commands in a pipeline
( )   2.2,3.6  brackets expressions and variable values
&     2.5      follows commands to be executed without waiting for completion


Filename metacharacters

/     1.6      separates components of a file’s pathname
?     1.6      expansion character matching any single character
*     1.6      expansion character matching any sequence of characters
[ ]   1.6      expansion sequence matching any single character from a set
~     1.6      used at the beginning of a filename to indicate home directories
{ }   4.2      used to specify groups of arguments with common parts


Quotation metacharacters

\     1.7      prevents meta-meaning of following single character
'     1.7      prevents meta-meaning of a group of characters
"     4.3      like ', but allows variable and command expansion


Input/output metacharacters

<     1.5      indicates redirected input
>     1.3      indicates redirected output


Expansion/substitution metacharacters

$     3.4      indicates variable substitution
!     2.3      indicates history substitution
:     3.6      precedes substitution modifiers
^     2.3      used in special forms of history substitution
`     4.3      indicates command substitution


Other metacharacters

#     1.3,3.6  begins scratch file names; indicates shell comments
-     1.2      prefixes option (flag) arguments to commands
%     2.6      prefixes job name specifications


Glossary


This glossary lists the most important terms introduced in the introduction to the shell 
and gives references to sections of the shell document for further information about them. 
References of the form ‘pr (1)’ indicate that the command pr is in the UNIX programmer’s 
manual in section 1. You can get an online copy of its manual page by doing 


man 1 pr 


References of the form (2.5) indicate that more information can be found in section 2.5 of this 
manual. 


. Your current directory has the name ‘.’ as well as the name printed by the
command pwd; see also dirs. The current directory ‘.’ is usually the first
component of the search path contained in the variable path, thus commands
which are in ‘.’ are found first (2.2). The character ‘.’ is also used in
separating components of filenames (1.6). The character ‘.’ at the beginning of a
component of a pathname is treated specially and not matched by the
filename expansion metacharacters ‘?’, ‘*’, and ‘[’ ‘]’ pairs (1.6).


.. Each directory has a file ‘..’ in it which is a reference to its parent directory.
After changing into the directory with chdir, i.e.


chdir paper


you can return to the parent directory by doing


chdir ..


The current directory is printed by pwd (2.7).


a.out Compilers which create executable images create them, by default, in the file
a.out for historical reasons (2.3).


absolute pathname 
A pathname which begins with a ‘/’ is absolute since it specifies the path of 
directories from the beginning of the entire directory system — called the root 
directory. Pathnames which are not absolute are called relative (see 
definition of relative pathname) (1.6). 


alias An alias specifies a shorter or different name for a UNIX command, or a 
transformation on a command to be performed in the shell. The shell has a 
command alias which establishes aliases and can print their current values. 
The command unalias is used to remove aliases (2.4). 


argument Commands in UNIX receive a list of argument words. Thus the command


echo a b c


consists of the command name ‘echo’ and three argument words ‘a’, ‘b’ and
‘c’. The set of arguments after the command name is said to be the
argument list of the command (1.1).


argv The list of arguments to a command written in the shell language (a shell
script or shell procedure) is stored in a variable called argv within the shell.
This name is taken from the conventional name in the C programming
language (3.4).


background Commands started without waiting for them to complete are called back- 
ground commands (2.6). 
base A filename is sometimes thought of as consisting of a base part, before any ‘.’ 


character, and an extension — the part after the ‘.’. See filename and exten- 
ston (1.6) 
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bg 


bin 


break 


breaksw 


builtin 


case 


cat 


cd 


chdir 


chsh 


cmp 


command 


command name 


The bg command causes a suspended job to continue execution in the back- 
ground (2.6). 


A directory containing binaries of programs and shell scripts to be executed is 
typically called a bin directory. The standard system bin directories are 
‘/bin’ containing the most heavily used commands and ‘/usr/bin’ which con- 
tains most other user programs. Programs developed at UC Berkeley live in 
‘/usr/ucb’, while locally written programs live in ‘/usr/local’. Games are kept 
in the directory ‘/usr/games’. You can place binaries in any directory. If you 
wish to execute them often, the name of the directories should be a com- 
ponent of the variable path. 


Break is a builtin command used to exit from loops within the control struc- 
ture of the shell (3.7). 


The breaksw builtin command is used to exit from a switch control structure, 
like a break exits from loops (3.7). 


A command executed directly by the shell is called a builtin command. Most 
commands in UNIX are not built into the shell, but rather exist as files in bin 
directories. These commands are accessible because the directories in which 
they reside are named in the path variable. 


A case command is used as a label in a switch statement in the shell’s control 
structure, similar to that of the language C. Details are given in the shell 
documentation ‘csh(1)’ (3.7). 


The cat program catenates a list of specified files on the standard output. It 
is usually used to look at the contents of a single file on the terminal, to ‘cat a 
file’ (1.8, 2.3). 


The cd command is used to change the working directory. With no argu- 
ments, cd changes your working directory to be your home directory (2.4, 
2.7). 


The chdir command is a synonym for cd. Cd is usually used because it is 
easier to type. 


The chsh command is used to change the shell which you use on UNIX. By 
default, you use a different version of the shell which resides in ‘/bin/sh’. 
You can change your shell to ‘/bin/csh’ by doing 


chsh your-login-name /bin/csh 


Thus I would do 


chsh bill /bin/csh 


It is only necessary to do this once. The next time you log in to UNIX after 
doing this command, you will be using csh rather than the shell in ‘/bin/sh’ 
(1.9). 


Cmp is a program which compares files. It is usually used on binary files, or 
to see if two files are identical (3.6). For comparing text files the program 
diff, described in ‘diff(1)’, is used. 


A function performed by the system, either by the shell (a builtin command) 
or by a program residing in a file in a directory within the UNIX system, is 
called a command (1.1). 


When a command is issued, it consists of a command name, which is the first 
word of the command, followed by arguments. The convention on UNIX is 
that the first word of a command names the function to be performed (1.1). 


command substitution 


component 


continue 


control- 


core dump 


cp 
csh 
.cshrc 


cwd 


date 
debugging 


default: 


DELETE 


detached 


diagnostic 




The replacement of a command enclosed in ‘`’ characters by the text output 
by that command is called command substitution (4.3). 


A part of a pathname between ‘/’ characters is called a component of that 
pathname. A variable which has multiple strings as value is said to have 
several components; each string is a component of the variable. 


A builtin command which causes execution of the enclosing foreach or while 
loop to cycle prematurely. Similar to the continue command in the program- 
ming language C (3.6). 


Certain special characters, called control characters, are produced by holding 
down the CONTROL key on your terminal and simultaneously pressing another 
character, much like the SHIFT key is used to produce upper case characters. 
Thus control-c is produced by holding down the CONTROL key while pressing 
the ‘c’ key. Usually UNIX prints an up-arrow (↑) followed by the corresponding 
letter when you type a control character (e.g. ‘↑C’ for control-c) (1.8). 


When a program terminates abnormally, the system places an image of its 
current state in a file named ‘core’. This core dump can be examined with 
the system debugger ‘adb(1)’ or ‘sdb(1)’ in order to determine what went 
wrong with the program (1.8). If the shell produces a message of the form 


Illegal instruction (core dumped) 


(where ‘Illegal instruction’ is only one of several possible messages), you 
should report this to the author of the program or a system administrator, 
saving the ‘core’ file. 


The cp (copy) program is used to copy the contents of one file into another 
file. It is one of the most commonly used UNIX commands (1.6). 


The name of the shell program that this document describes. 


The file .cshrc in your home directory is read by each shell as it begins 
execution. It is usually used to change the setting of the variable path and to set 
alias parameters which are to take effect globally (2.1). 


The cwd variable in the shell holds the absolute pathname of the current 
working directory. It is changed by the shell whenever your current working 
directory changes and should not be changed otherwise (2.2). 


The date command prints the current date and time (1.3). 


Debugging is the process of correcting mistakes in programs and shell scripts. 
The shell has several options and variables which may be used to aid in shell 
debugging (4.4). 


The label default: is used within shell switch statements, as it is in the C 
language, to label the code to be executed if none of the case labels matches 
the value switched on (3.7). 


The DELETE or RUBOUT key on the terminal normally causes an interrupt to 
be sent to the current job. Many users change the interrupt character to be 
↑C. 


A command that continues running in the background after you logout is said 
to be detached. 


An error message produced by a program is often referred to as a diagnostic. 
Most error messages are not written to the standard output, since that is 
often directed away from the terminal (1.3, 1.5). Error messages are instead 
written to the diagnostic output which may be directed away from the 
terminal, but usually is not. Thus diagnostics will usually appear on the terminal 
(2.5). 




directory 


directory stack 


dirs 
du 


echo 


else 


endif 


EOF 


escape 


/etc/passwd 


exit 


exit status 


A structure which contains files. At any time you are in one particular 
directory whose name can be printed by the command pwd. The chdir command 
will change you to another directory, and make the files in that directory 
visible. The directory in which you are when you first login is your home 
directory (1.1, 2.7). 


The shell saves the names of previous working directories in the directory 
stack when you change your current working directory via the pushd com- 
mand. The directory stack can be printed by using the dirs command, which 
includes your current working directory as the first directory name on the left 
(2.7). 


The dirs command prints the shell’s directory stack (2.7). 


The du command is a program (described in ‘du(1)’) which prints the number 
of disk blocks in all directories below and including your current working 
directory (2.6). 


The echo command prints its arguments (1.6, 3.6). 


The else command is part of the ‘if-then-else-endif’ control command 
construct (3.6). 


If an if statement is ended with the word then, all lines following the if up to 
a line starting with the word endif or else are executed if the condition 
between parentheses after the if is true (3.6). 


An end-of-file is generated by the terminal by a control-d, and whenever a 
command reads to the end of a file which it has been given as input. Com- 
mands receiving input from a pipe receive an end-of-file when the command 
sending them input completes. Most commands terminate when they receive 
an end-of-file. The shell has an option to ignore end-of-file from a terminal 
input which may help you keep from logging out accidentally by typing too 
many control-d’s (1.1, 1.8, 3.8). 


A character ‘\’ used to prevent the special meaning of a metacharacter is said 
to escape the character from its special meaning. Thus 


echo \* 


will echo the character ‘*’ while just 


echo * 


will echo the names of the files in the current directory. In this example, ‘\’ 
escapes ‘*’ (1.7). There is also a non-printing character called escape, usually 
labelled ESC or ALTMODE on terminal keyboards. Some older UNIX systems 
use this character to indicate that output is to be suspended. Most systems 
use control-s to stop the output and control-q to start it. 


This file contains information about the accounts currently on the system. It 
consists of a line for each account with fields separated by ‘:’ characters (1.8). 
You can look at this file by saying 


cat /etc/passwd 


The commands finger and grep are often used to search for information in 
this file. See ‘finger(1)’, ‘passwd(5)’, and ‘grep(1)’ for more details. 

The exit command is used to force termination of a shell script, and is built 
into the shell (3.9). 


A command which discovers a problem may reflect this back to the command 
(such as a shell) which invoked (executed) it. It does this by returning a 
non-zero number as its exit status, a status of zero being considered ‘normal 
termination’. The exit command can be used to force a shell command script 
to give a non-zero exit status (3.6). 


expansion 


expressions 


extension 


fg 


filename 


The replacement of strings in the shell input which contain metacharacters by 
other strings is referred to as the process of expansion. Thus the replace- 
ment of the word ‘*’ by a sorted list of files in the current directory is a 
‘filename expansion’. Similarly the replacement of the characters ‘!!’ by the 
text of the last command is a ‘history expansion’. Expansions are also 
referred to as substitutions (1.6, 3.4, 4.2). 


Expressions are used in the shell to control the conditional structures used in 
the writing of shell scripts and in calculating values for these scripts. The 
operators available in shell expressions are those of the language C (3.5). 


Filenames often consist of a base name and an extension separated by the 
character ‘.’. By convention, groups of related files often share the same root 
name. Thus if ‘prog.c’ were a C program, then the object file for this program 
would be stored in ‘prog.o’. Similarly a paper written with the ‘-me’ nroff 
macro package might be stored in ‘paper.me’ while a formatted version of this 
paper might be kept in ‘paper.out’ and a list of spelling errors in ‘paper.errs’ 
(1.6). 
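
The root-and-extension convention can be illustrated outside the shell. The following is a minimal sketch in Python, not in csh; the helper name split_extension is hypothetical, and it simply treats the text after the last ‘.’ as the extension.

```python
import posixpath


def split_extension(filename):
    """Split a filename into (root, extension) at the last '.'."""
    root, ext = posixpath.splitext(filename)
    # splitext keeps the leading '.', so drop it from the extension.
    return root, ext[1:]


# Related files share the root name 'paper' and differ only in extension.
for name in ["paper.me", "paper.out", "paper.errs"]:
    print(split_extension(name))
```

Grouping files by the first element of each pair is one way to collect everything that belongs to the same paper or program.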


The job control command fg is used to run a background or suspended job 
in the foreground (1.8, 2.6). 


Each file in UNIX has a name consisting of up to 14 characters and not includ- 
ing the character ‘/’ which is used in pathname building. Most filenames do 
not begin with the character ‘.’, and contain only letters and digits with 
perhaps a ‘.’ separating the base portion of the filename from an extension 
(1.6). 


filename expansion 


flag 


foreach 


foreground 


goto 


grep 


Filename expansion uses the metacharacters ‘*’, ‘?’ and ‘[’ and ‘]’ to provide 
a convenient mechanism for naming files. Using filename expansion it is 
easy to name all the files in the current directory, or all files which have a 
common root name. Other filename expansion mechanisms use the 
metacharacter ‘~’ and allow files in other users’ directories to be named easily (1.6, 
4.2). 
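
The effect of the ‘*’, ‘?’ and ‘[’ ‘]’ metacharacters can be tried out with a small sketch. It is written in Python and uses the standard fnmatch module as a stand-in for the shell's own matching, so details may differ from csh (for example, fnmatch gives no special treatment to names beginning with ‘.’, and ‘~’ is not modelled). The function name expand and the list of names are made up for illustration.

```python
import fnmatch


def expand(pattern, names):
    """Return the sorted names matching a pattern built from *, ? and [...]."""
    return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))


# Hypothetical directory contents, used only for illustration.
names = ["prog.c", "prog.o", "paper.me", "paper.out", "a1", "a2", "b1"]

print(expand("prog.*", names))   # every file with the root name 'prog'
print(expand("a?", names))       # 'a' followed by exactly one character
print(expand("[ab]1", names))    # 'a' or 'b' followed by '1'
```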


Many UNIX commands accept arguments which are not the names of files or 
other users but are used to modify the action of the commands. These are 
referred to as flag options, and by convention consist of one or more letters 
preceded by the character ‘-’ (1.2). Thus the ls (list files) command has an 
option ‘-s’ to list the sizes of files. This is specified 


ls -s 


The foreach command is used in shell scripts and at the terminal to specify 
repetition of a sequence of commands while the value of a certain shell vari- 
able ranges through a specified list (3.6, 4.1). 


When commands are executing in the normal way such that the shell is wait- 
ing for them to finish before prompting for another command they are said to 
be foreground jobs or running in the foreground. This is as opposed to 
background. Foreground jobs can be stopped by signals from the terminal 
caused by typing different control characters at the keyboard (1.8, 2.6). 


The shell has a command goto used in shell scripts to transfer control to a 
given label (3.7). 


The grep command searches through a list of argument files for a specified 
string. Thus 


grep bill /etc/passwd 


will print each line in the file /etc/passwd which contains the string ‘bill’. 
Actually, grep scans for regular expressions in the sense of the editors ‘ed(1)’ 
and ‘ex(1)’. Grep stands for ‘globally find regular expression and print’ (2.4). 


head 


history 


home directory 


if 


ignoreeof 


input 


interrupt 


The head command prints the first few lines of one or more files. If you have 
a bunch of files containing text which you are wondering about it is some- 
times useful to run head with these files as arguments. This will usually 
show enough of what is in these files to let you decide which you are 
interested in (1.5). 

Head is also used to describe the part of a pathname before and including 
the last ‘/’ character. The tail of a pathname is the part after the last ‘/’. 
The ‘:h’ and ‘:t’ modifiers allow the head or tail of a pathname stored in a 
shell variable to be used (3.6). 
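
The split into head and tail can be sketched as follows. This is an illustration in Python, not the shell's implementation; the function name head_and_tail is hypothetical, and the sketch follows the wording above by keeping the last ‘/’ in the head. The exact text produced by the ‘:h’ and ‘:t’ modifiers should be checked in ‘csh(1)’.

```python
def head_and_tail(pathname):
    """Split a pathname at its last '/' into (head, tail)."""
    head, slash, tail = pathname.rpartition("/")
    # Following the glossary wording, the head includes the last '/'.
    return head + slash, tail


print(head_and_tail("/usr/bill/prog.c"))
print(head_and_tail("prog.c"))
```

A pathname with no ‘/’ at all has an empty head and is all tail.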


The history mechanism of the shell allows previous commands to be 
repeated, possibly after modification to correct typing mistakes or to change 
the meaning of the command. The shell has a history list where these com- 
mands are kept, and a history variable which controls how large this list is 
(2.3). 


Each user has a home directory, which is given in your entry in the password 
file, /etc/passwd. This is the directory which you are placed in when you first 
login. The cd or chdir command with no arguments takes you back to this 
directory, whose name is recorded in the shell variable home. You can also 
access the home directories of other users in forming filenames using a 
filename expansion notation and the character ‘~’ (1.6). 


A conditional command within the shell, the if command is used in shell com- 
mand scripts to make decisions about what course of action to take next (3.6). 


Normally, your shell will exit, printing ‘logout’ if you type a control-d at a 
prompt of ‘% ’. This is the way you usually log off the system. You can set 
the ignoreeof variable if you wish in your .login file and then use the com- 
mand logout to logout. This is useful if you sometimes accidentally type too 
many control-d characters, logging yourself off (2.2). 


Many commands on UNIX take information from the terminal or from files 
which they then act on. This information is called input. Commands nor- 
mally read for input from their standard input which is, by default, the ter- 
minal. This standard input can be redirected from a file using a shell 
metanotation with the character ‘<’. Many commands will also read from a 
file specified as argument. Commands placed in pipelines will read from the 
output of the previous command in the pipeline. The leftmost command in a 
pipeline reads from the terminal if you neither redirect its input nor give it a 
filename to use as standard input. Special mechanisms exist for supplying 
input to commands in shell scripts (1.5, 3.8). 


An interrupt is a signal to a program that is generated by hitting the RUBOUT 
or DELETE key (although users can and often do change the interrupt 
character, usually to ↑C). It causes most programs to stop execution. Certain 
programs, such as the shell and the editors, handle an interrupt in special ways, 
usually by stopping what they are doing and prompting for another command. 
While the shell is executing another command and waiting for it to finish, the 
shell does not listen to interrupts. The shell often wakes up when you hit 
interrupt because many commands die when they receive an interrupt (1.8, 
3.9). 


One or more commands typed on the same input line separated by ‘|’ or ‘;’ 
characters are run together and are called a job. Simple commands run by 
themselves without any ‘|’ or ‘;’ characters are the simplest jobs. Jobs are 
classified as foreground, background, or suspended (2.6). 


The builtin functions that control the execution of jobs are called job control 
commands. These are bg, fg, stop, kill (2.6). 


When each job is started it is assigned a small number called a job number 
which is printed next to the job in the output of the jobs command. This 
number, preceded by a ‘%’ character, can be used as an argument to job con- 
trol commands to indicate a specific job (2.6). 


The jobs command prints a table showing jobs that are either running in the 
background or are suspended (2.6). 


A command which sends a signal to a job causing it to terminate (2.6). 


The file .login in your home directory is read by the shell each time you login 
to UNIX and the commands there are executed. There are a number of com- 
mands which are usefully placed here, especially set commands to the shell 
itself (2.1). 


The shell that is started on your terminal when you login is called your login 
shell. It is different from other shells which you may run (e.g. on shell 
scripts) in that it reads the .login file before reading commands from the ter- 
minal and it reads the .logout file after you logout (2.1). 


The logout command causes a login shell to exit. Normally, a login shell will 
exit when you hit control-d generating an end-of-file, but if you have set 
ignoreeof in your .login file then this will not work and you must use logout to 
log off the UNIX system (2.8). 


When you log off of UNIX the shell will execute commands from the file 
.logout in your home directory after it prints ‘logout’. 


The command lpr is the line printer daemon. The standard input of lpr is 
spooled and printed on the UNIX line printer. You can also give lpr a list of 
filenames as arguments to be printed. It is most common to use lpr as the 
last component of a pipeline (2.3). 


The ls (list files) command is one of the most commonly used UNIX com- 
mands. With no argument filenames it prints the names of the files in the 
current directory. It has a number of useful flag arguments, and can also be 
given the names of directories as arguments, in which case it lists the names 
of the files in these directories (1.2). 


The mail program is used to send and receive messages from other UNIX users 
(1.1, 2.1). 


The make command is used to maintain one or more related files and to 
organize functions to be performed on these files. In many ways make is 
easier to use, and more helpful than shell command scripts (3.2). 


The file containing commands for make is called makefile (3.2). 


The manual often referred to is the ‘UNIX programmer’s manual’. It contains 
a number of sections and a description of each UNIX program. An online ver- 
sion of the manual is accessible through the man command. Its documenta- 
tion can be obtained online via 


man man 


Many characters which are neither letters nor digits have special meaning 
either to the shell or to UNIX. These characters are called metacharacters. If 
it is necessary to place these characters in arguments to commands without 
them having their special meaning then they must be quoted. An example of 
a metacharacter is the character ‘>’ which is used to indicate placement of 
output into a file. For the purposes of the history mechanism, most 
unquoted metacharacters form separate words (1.4). The appendix to this 
user’s manual lists the metacharacters in groups by their function. 


The mkdir command is used to create a new directory. 

Substitutions with the history mechanism, keyed by the character ‘!’ or of 
variables using the metacharacter ‘$’, are often subjected to modifications, 
indicated by placing the character ‘:’ after the substitution and following this 
with the modifier itself. The command substitution mechanism can also be 
used to perform modification in a similar way, but this notation is less clear 
(3.6). 


The program more writes a file on your terminal allowing you to control how 
much text is displayed at a time. More can move through the file screenful 
by screenful, line by line, search forward for a string, or start again at the 
beginning of the file. It is generally the easiest way of viewing a file (1.8). 


The shell has a variable noclobber which may be set in the file .login to 
prevent accidental destruction of files by the ‘>’ output redirection 
metasyntax of the shell (2.2, 2.5). 


The shell variable noglob is set to suppress the filename expansion of 
arguments containing the metacharacters ‘~’, ‘*’, ‘?’, ‘[’ and ‘]’ (3.6). 


The notify command tells the shell to report on the termination of a specific 
background job at the exact time it occurs as opposed to waiting until just 
before the next prompt to report the termination. The notify variable, if set, 
causes the shell to always report the termination of background jobs exactly 
when they occur (2.6). 


The onintr command is built into the shell and is used to control the action 
of a shell command script when an interrupt signal is received (3.9). 


Many commands in UNIX result in some lines of text which are called their 
output. This output is usually placed on what is known as the standard 
output which is normally connected to the user’s terminal. The shell has a 
syntax using the metacharacter ‘>’ for redirecting the standard output of a 
command to a file (1.3). Using the pipe mechanism and the metacharacter ‘|’ it is 
also possible for the standard output of one command to become the 
standard input of another command (1.5). Certain commands such as the line 
printer daemon lpr do not place their results on the standard output but 
rather in more useful places such as on the line printer (2.3). Similarly the 
write command places its output on another user’s terminal rather than its 
standard output (2.3). Commands also have a diagnostic output where they 
write their error messages. Normally these go to the terminal even if the 
standard output has been sent to a file or another command, but it is possi- 
ble to direct error diagnostics along with standard output using a special 
metanotation (2.5). 


The pushd command, which means ‘push directory’, changes the shell’s work- 
ing directory and also remembers the current working directory before the 
change is made, allowing you to return to the same directory via the popd 
command later without retyping its name (2.7). 


The shell has a variable path which gives the names of the directories in 
which it searches for the commands which it is given. It always checks first to 
see if the command it is given is built into the shell. If it is, then it need not 
search for the command as it can do it internally. If the command is not 
builtin, then the shell searches for a file with the name given in each of the 
directories in the path variable, left to right. Since the normal definition of 
the path variable is 


path (. /usr/ucb /bin /usr/bin) 


the shell normally looks in the current directory, and then in the standard 
system directories ‘/usr/ucb’, ‘/bin’ and ‘/usr/bin’ for the named command 
(2.2). If the command cannot be found the shell will print an error diagnos- 
tic. Scripts of shell commands will be executed using another shell to inter- 
pret them if they have ‘execute’ permission set. This is normally true because 
a command of the form 


chmod 755 script 


was executed to turn this execute permission on (3.3). If you add new com- 
mands to a directory in the path, you should issue the command rehash 
(2.2). 
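
The search order described in this entry can be sketched in a few lines. The following is an illustrative model in Python, not the shell's code: the function name find_command and the temporary directories are assumptions made for the example, and the sketch scans the directories on every call rather than keeping the internal table that rehash rebuilds.

```python
import os
import tempfile


def find_command(name, path, builtins=()):
    """Return 'builtin', the full pathname of the first match, or None."""
    if name in builtins:
        return "builtin"            # builtins are checked before the path
    for directory in path:          # directories are searched left to right
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None                     # the shell would print a diagnostic


# Build two throwaway 'bin' directories that both contain a command 'hello'.
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    for directory in (first, second):
        target = os.path.join(directory, "hello")
        with open(target, "w") as f:
            f.write("echo hello\n")
        os.chmod(target, 0o755)     # same mode as 'chmod 755 script'
    # The copy in the leftmost directory wins.
    print(find_command("hello", [first, second]) == os.path.join(first, "hello"))
    print(find_command("cd", [first, second], builtins={"cd"}))
    print(find_command("missing", [first, second]))
```

Because the leftmost match wins, the order of the directories in path decides which of two commands with the same name is run.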


A list of names, separated by ‘/’ characters, forms a pathname. Each com- 
ponent, between successive ‘/’ characters, names a directory in which the next 
component file resides. Pathnames which begin with the character ‘/’ are 
interpreted relative to the root directory in the filesystem. Other pathnames 
are interpreted relative to the current directory as reported by pwd. The last 
component of a pathname may name a directory, but usually names a file. 


A group of commands which are connected together, the standard output of 
each connected to the standard input of the next, is called a pipeline. The 
pipe mechanism used to connect these commands is indicated by the shell 
metacharacter ‘|’ (1.5, 2.3). 


The popd command changes the shell’s working directory to the directory 
you most recently left using the pushd command. It returns to the directory 
without having to type its name, forgetting the name of the current working 
directory before doing so (2.7). 
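
The way pushd, popd and dirs work together can be modelled with a short list-based sketch. It is written in Python purely as an illustration; the class DirectoryStack is hypothetical, only the one-argument form of pushd is modelled, and no real directory is changed.

```python
class DirectoryStack:
    """Toy model: entry 0 is the current working directory."""

    def __init__(self, start):
        self.entries = [start]

    def pushd(self, directory):
        # The old working directory stays on the stack, just below the new one.
        self.entries.insert(0, directory)

    def popd(self):
        if len(self.entries) < 2:
            raise IndexError("directory stack empty")
        # Forget the current directory and return to the one below it.
        self.entries.pop(0)
        return self.entries[0]

    def dirs(self):
        # Current working directory first, on the left.
        return " ".join(self.entries)


stack = DirectoryStack("/usr/bill")
stack.pushd("/usr/games")
print(stack.dirs())
print(stack.popd())
```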


The part of a computer system to which each terminal is connected is called a 
port. Usually the system has a fixed number of ports, some of which are con- 
nected to telephone lines for dial-up access, and some of which are per- 
manently wired directly to specific terminals. 


The pr command is used to prepare listings of the contents of files with 
headers giving the name of the file and the date and time at which the file 
was last modified (2.3). 


The printenv command is used to print the current setting of variables in the 
environment (2.8). 


An instance of a running program is called a process (2.6). UNIX assigns each 
process a unique number when it is started — called the process number. 
Process numbers can be used to stop individual processes using the kill or 
stop commands when the processes are part of a detached background job. 


Usually synonymous with command; a binary file or shell command script 
which performs a useful function is often called a program. 


programmer’s manual 


prompt 


Also referred to as the manual. See the glossary entry for ‘manual’. 


Many programs will print a prompt on the terminal when they expect input. 
Thus the editor ‘ex(1)’ will print a ‘:’ when it expects input. The shell 
prompts for input with ‘% ’ and occasionally with ‘? ’ when reading com- 
mands from the terminal (1.1). The shell has a variable prompt which may 
be set to a different value to change the shell’s main prompt. This is mostly 
used when debugging the shell (2.8). 


The ps command is used to show the processes you are currently running. 
Each process is shown with its unique process number, an indication of the 
terminal name it is attached to, an indication of the state of the process 
(whether it is running, stopped, awaiting some event (sleeping), and whether 
it is swapped out), and the amount of CPU time it has used so far. The com- 
mand is identified by printing some of the words used when it was invoked 
(2.6). Shells, such as the csh you use to run the ps command, are not nor- 
mally shown in the output. 


The pwd command prints the full pathname of the current working direc- 
tory. The dirs builtin command is usually a better and faster choice. 


The quit signal, generated by a control-\, is used to terminate programs which 
are behaving unreasonably. It normally produces a core image file (1.8). 


The process by which metacharacters are prevented from having their special 
meaning, usually by using the character ‘'’ in pairs, or by using the character ‘\’, is 
referred to as quotation (1.7). 


The routing of input or output from or to a file is known as redirection of 
input or output (1.3). 


The rehash command tells the shell to rebuild its internal table of which 
commands are found in which directories in your path. This is necessary 
when a new program is installed in one of these directories (2.8). 


relative pathname 


repeat 
root 


RUBOUT 


scratch file 


script 


set 


A pathname which does not begin with a ‘/’ is called a relative pathname 
since it is interpreted relative to the current working directory. The first 
component of such a pathname refers to some file or directory in the working 
directory, and subsequent components between ‘/’ characters refer to direc- 
tories below the working directory. Pathnames that are not relative are 
called absolute pathnames (1.6). 
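
The difference between the two kinds of pathname can be shown with a small sketch. It is written in Python as an illustration only; the function name resolve and the example directories are assumptions, and the standard posixpath module is used so that ‘/’ is always the separator.

```python
import posixpath


def resolve(pathname, working_directory):
    """Return the absolute pathname that a pathname refers to."""
    if pathname.startswith("/"):
        return posixpath.normpath(pathname)      # already absolute
    # Relative: interpret starting from the working directory.
    return posixpath.normpath(posixpath.join(working_directory, pathname))


print(resolve("src/prog.c", "/usr/bill"))
print(resolve("/bin/csh", "/usr/bill"))
```

The same relative pathname names a different file whenever the working directory changes, while an absolute pathname always names the same one.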


The repeat command iterates another command a specified number of times. 


The directory that is at the top of the entire directory structure is called the 
root directory since it is the ‘root’ of the entire tree structure of directories. 
The name used in pathnames to indicate the root is ‘/’. Pathnames starting 
with ‘/’ are said to be absolute since they start at the root directory. Root is 
also used as the part of a pathname that is left after removing the extension. 
See filename for a further explanation (1.6). 


The RUBOUT or DELETE key sends an interrupt to the current job. Most 
interactive commands return to their command level upon receipt of an inter- 
rupt, while non-interactive commands usually terminate, returning control to 
the shell. Users often change interrupt to be generated by ↑C rather than 
DELETE by using the stty command. 


Files whose names begin with a ‘#’ are referred to as scratch files, since they 
are automatically removed by the system after a couple of days of non-use, or 
more frequently if disk space becomes tight (1.3). 


Sequences of shell commands placed in a file are called shell command 
scripts. It is often possible to perform simple tasks using these scripts 
without writing a program in a language such as C, by using the shell to selec- 
tively run other programs (3.3, 3.10). 


The builtin set command is used to assign new values to shell variables and 
to show the values of the current variables. Many shell variables have special 
meaning to the shell itself. Thus by using the set command the behavior of 
the shell can be affected (2.1). 


Variables in the environment ‘environ(5)’ can be changed by using the setenv 
builtin command (2.8). The printenv command can be used to print the 
value of the variables in the environment. 


A shell is a command language interpreter. It is possible to write and run 
your own shell, as shells are no different than any other programs as far as 
the system is concerned. This manual deals with the details of one particular 
shell, called csh. 


See script (3.3, 3.10). 


A signal in UNIX is a short message that is sent to a running program which 
causes something to happen to that process. Signals are sent either by typing 
special control characters on the keyboard or by using the kill or stop com- 
mands (1.8, 2.6). 


The sort program sorts a sequence of lines in ways that can be controlled by 
argument flags (1.5). 


The source command causes the shell to read commands from a specified file. 
It is most useful for reading files such as .cshrc after changing them (2.8). 


See metacharacters and the appendix to this manual. 


We refer often to the standard input and standard output of commands. 
See input and output (1.3, 3.8). 


A command normally returns a status when it finishes. By convention a 
status of zero indicates that the command succeeded. Commands may return 
non-zero status to indicate that some abnormal event has occurred. The shell 
variable status is set to the status returned by the last command. It is most 
useful in shell command scripts (3.6). 
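
The zero and non-zero convention is not specific to csh and can be observed from any program that starts another. The following is a minimal sketch in Python using the standard subprocess module; the function name run_and_report is hypothetical, and the child process is simply a Python interpreter told to exit with a chosen number. In the shell the same number would appear in the status variable.

```python
import subprocess
import sys


def run_and_report(exit_code):
    """Run a child process that exits with exit_code; return its status."""
    child = [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    return subprocess.run(child).returncode


for code in (0, 3):
    status = run_and_report(code)
    print(status, "succeeded" if status == 0 else "reported a problem")
```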


The stop command causes a background job to become suspended (2.6). 


A sequential group of characters taken together is called a string. Strings 
can contain any printable characters (2.2). 


The stty program changes certain parameters inside UNIX which determine 
how your terminal is handled. See ‘stty(1)’ for a complete description (2.6). 


The shell implements a number of substitutions where sequences indicated 
by metacharacters are replaced by other sequences. Notable examples of this 
are history substitution keyed by the metacharacter ‘!’ and variable substitu- 
tion indicated by ‘$’. We also refer to substitutions as expansions (3.4). 


A job becomes suspended after a STOP signal is sent to it, either by typing a 
control-z at the terminal (for foreground jobs) or by using the stop command 
(for background jobs). When suspended, a job temporarily stops running 
until it is restarted by either the fg or bg command (2.6). 


The switch command of the shell allows the shell to select one of a number of 
sequences of commands based on an argument string. It is similar to the 
switch statement in the language C (3.7). 


When a command which is being executed finishes we say it undergoes termi- 
nation or terminates. Commands normally terminate when they read an 
end-of-file from their standard input. It is also possible to terminate com- 
mands by sending them an interrupt or quit signal (1.8). The kill program 
terminates specified jobs (2.6). 


The then command is part of the shell’s ‘if-then-else-endif’ control construct 
used in command scripts (3.6). 


The time command can be used to measure the amount of CPU and real time 
consumed by a specified command as well as the amount of disk i/o, memory 
utilized, and number of page faults and swaps taken by the command (2.1, 
2.8). 


The tset program is used to set standard erase and kill characters and to tell 
the system what kind of terminal you are using. It is often invoked in a 
.login file (2.1). 


The word tty is a historical abbreviation for ‘teletype’ which is frequently 
used in UNIX to indicate the port to which a given terminal is connected. The 
tty command will print the name of the tty or port to which your terminal is 
presently connected. 


The unalias command removes aliases (2.8). 


UNIX is an operating system on which csh runs. UNIX provides facilities 
which allow csh to invoke other programs such as editors and text formatters 
which you may wish to use. 


The unset command removes the definitions of shell variables (2.2, 2.8). 


See variables and expansion (2.2, 3.4). 


Variables in csh hold one or more strings as value. The most common use of 
variables is in controlling the behavior of the shell. See path, noclobber, and 
ignoreeof for examples. Variables such as argv are also used in writing shell 
programs (shell command scripts) (2.2). 


The verbose shell variable can be set to cause commands to be echoed after 
they are history expanded. This is often useful in debugging shell scripts. 
The verbose variable is set by the shell’s -v command line option (3.10). 


The wc program calculates the number of characters, words, and lines in the 
files whose names are given as arguments (2.6). 


The while builtin control construct is used in shell command scripts (3.7). 


A sequence of characters which forms an argument to a command is called a 
word. Many characters which are neither letters, digits, ‘-’, ‘.’ nor ‘/’ form 
words all by themselves even if they are not surrounded by blanks. Any 
sequence of characters may be made into a word by surrounding it with ‘'’ 
characters except for the characters ‘'’ and ‘!’ which require special treatment 
(1.1). This process of placing special characters in words without their special 
meaning is called quoting. 


working directory 
At any given time you are in one particular directory, called your working 
directory. This directory’s name is printed by the pwd command and the 
files listed by ls are the ones in this directory. You can change working direc- 
tories using chdir. 


write 
The write command is used to communicate with other users who are logged 
in to UNIX. 


PART 5: DOCUMENT PREPARATION 


This part includes articles on the features and utilities of the ULTRIX-32 system that will 
help you to prepare written information for publication. Seven of the articles deal with nroff 
and troff, the text formatters that convert unformatted text into a formal document ready for 
output on a printer or typesetter. Nroff produces output printable on a typewriter-like termi- 
nal, line printer, or terminal screen. Troff prepares output for a phototypesetter. Five other 
articles explain the uses of eqn, tbl, and refer. These are utilities that cooperate with the text 
processors to produce mathematical equations, tables, and bibliographical references in the 
text formatted by nroff or troff. An additional article describes the style and diction programs, 
tools that provide criteria for evaluating written material. 


Nroff and Troff 


Formatting a document on the ULTRIX-32 system is a two-stage process. In stage one, you 
create or change a file using vi or one of the other editors. This file should contain the text to 
be processed and commands to the text formatter. The commands tell the formatter how to 
treat the text, for example how wide to make the margins, when to start a new paragraph, and 
when to leave the right margin unjustified. In stage two, you give a command to the shell tel- 
ling nroff or troff to process the text in the file you created. Nroff and troff are compatible, so 
that one text file can generally serve as a source for both line printer output and typesetter 
output. 


The text processors allow you to define exactly how you want your text to look. However, 
developing a format that is consistent throughout a document involves repeating many details 
(consider page headers and multicolumn formats, for example). ULTRIX-32 includes two 
macro packages (-ms and -me) that specify many details and simplify the specification of 
other details for you. These macro packages serve to reduce your direct contact with nroff 
and troff, making the text formatting process easier. The articles by Lesk, “Typing Docu- 
ments on the UNIX System: Using the -ms Macros with TROFF and NROFF,” and Tuthill, 
“A Revised Version of -ms,” tell what there is to know about using -ms. “A Guide to Prepar- 
ing Documents with -ms,” also by Lesk, gives comprehensive examples. 


The topics include: 
• Cover sheet format such as author, title, abstract 
• Page headings 
• Multicolumn format 
• Section headings 
• Paragraph control 
• Italics 
• Footnotes 
• Specifying dates 
• Changing default values 
• Using accent marks 
• Automatic footnote numbering 


The Lesk article is readable and arranged in a tutorial format. The Tuthill article is a brief 
supplement. 


“Writing Papers with NROFF Using -ME,” by Allman, covers many of the same topics. This 
article is also tutorial. It provides good explanations and examples. 


The “ME Reference Manual,” also by Allman, lists all features of the -me macro package. 
Read it if you want greater flexibility than is allowed by the procedures shown in the first All- 
man article. 


The “NROFF/TROFF User’s Manual,” by Ossanna, is appropriate for users already familiar 
with the macro packages who want to develop their own nroff or troff macros or macro pack- 
ages. The first part of this article lists the command line options for the text formatters, all 
nroff and troff commands, escape sequences, and predefined registers. The second part 
defines in detail the rules that govern use of the text formatters. A set of examples completes 
the article. 


“A TROFF Tutorial,” by Kernighan, concentrates on features of troff that are specific to 
typesetting such as: 


• Point sizes 
• Font changes 
• Special characters 
• Horizontal and vertical motions 


The information in this article is appropriate for users who want more flexibility in typesetter 
control than they can get with the -ms and -me macro packages. 


Preprocessors 


Three preprocessor utilities expand the text formatting capabilities of the ULTRIX-32 sys- 
tem: 


eqn lets you typeset mathematical expressions. 
tbl helps you to format tables easily. 
refer helps you to create bibliographical references. 


These utilities process notation for mathematical expressions, tables, and bibliographical 
descriptions, turning them into sequences of commands for nroff or troff. 


This part includes two articles on eqn by Kernighan and Cherry. “A System for Typesetting 
Mathematics” outlines the design goals and capabilities of eqn. The second article, “Typeset- 
ting Mathematics - User’s Guide,” shows how to make eqn produce: 


• Equations 
• Special symbols 
• Greek letters 
• Subscripts and superscripts 
• Braces 
• Piles 
• Matrices 
• Local motions 


Read this second article for practical information on using eqn. Read the first eqn article, “A 
System for Typesetting Mathematics,” if you want to know more about the background of 
eqn. 


“TBL - A Program to Format Tables,” by Lesk, serves as a reference and a tutorial. The first 
part of the article lists rules for using tbl to create tables. The remainder of the article con- 
sists of examples showing sequences of commands supplied to tbl and the resulting tables. 


Three of the articles in this part deal with utilities related to bibliographies and indexing. 
Using refer to make bibliographical references in a text requires three steps: 


1 You must build a data base that describes the items that can be referenced. Each 
entry in the data base identifies the publication by several categories such as: 


• Author 
• Title 
• Issuer (publisher) 
• City where published 
• Date of publication 


Enter this information by running the addbib utility. Note that you can list the 
entire data base, sorted by author and date, by running the sortbib and roffbib utili- 
ties. 


2 As you write a new text to be processed by nroff or troff, you can create a standard 
bibliographical reference to an item contained in the data base by specifying one or 
two key fields of the data base item. 


3 Run the refer and nroff or troff utilities to process the text. 


Tuthill’s article, “Refer - A Bibliography System,” is the most readable and useful of the three 
articles on refer. 


The Lesk articles, “Some Applications of Inverted Indexes on the UNIX System” and 
“Updating Publication Lists,” deal with indexing and bibliographical referencing. The exam- 
ples that relate to refer may be useful, if you read the Tuthill article first. The explanations 
of indexing are hard to follow. If you must use the searching and indexing utilities, you may 
want help from someone who uses this software to supplement the Lesk articles. 


Style and Diction 


The style and diction programs can help you evaluate and refine your writing skills. The texts to 
be evaluated can be in a file on the system. The article entitled “Writing Tools - The Style 
and Diction Programs,” by Cherry and Vesterman, explains the yardsticks that style uses to 
measure: 

• Readability levels 
• Sentence structure 
• Word usage (by parts of speech) 
• Sentence openers 


The article also shows how to use the diction program to identify phrases that are frequently 
misused or wordy. You can use the explain program together with diction to find substitutes 
for the objectionable phrases. 


Summary 


The articles on -ms and -me (choose one) will help you to get started using nroff and troff. 
Eqn, tbl, and refer work with nroff and troff to simplify typesetting mathematical expressions, 
formatting tables, and making bibliographical references. The style and diction programs will 
help you to evaluate what you write. 




Typing Documents on the UNIX System: 
Using the -ms Macros with Troff and Nroff 


M. E. Lesk 


Bell Laboratories 
Murray Hill, New Jersey 07974 


Introduction. This memorandum describes a package of commands to produce papers 
using the troff and nroff formatting programs on the UNIX system. As with other roff-derived 
programs, text is prepared interspersed with formatting commands. However, this package, 
which itself is written in troff commands, provides higher-level commands than those pro- 
vided with the basic troff program. The commands available in this package are listed in 
Appendix A. 


Text. Type normally, except that instead of indenting for paragraphs, place a line read- 
ing “.PP” before each paragraph. This will produce indenting and extra space. 


Alternatively, the command .LP that was used here will produce a left-aligned (block) para- 
graph. The paragraph spacing can be changed: see below under “Registers.” 


Beginning. For a document with a paper-type cover sheet, the input should start as fol- 
lows: 


[optional overall format .RP - see below] 
.TL 
Title of document (one or more lines) 
.AU 
Author(s) (may also be several lines) 
.AI 
Author’s institution(s) 
.AB 
Abstract; to be placed on the cover sheet of a paper. 
Line length is 5/6 of normal; use .ll here to change. 
.AE (abstract end) 
text ... (begins with .PP, which see) 


To omit some of the standard headings (e.g. no abstract, or no author’s institution) just omit 
the corresponding fields and command lines. The word ABSTRACT can be suppressed by writ- 
ing “.AB no” for “.AB”. Several interspersed .AU and .AI lines can be used for multiple 
authors. The headings are not compulsory: beginning with a .PP command is perfectly OK 
and will just start printing an ordinary paragraph. Warning: You can’t just begin a docu- 
ment with a line of text. Some -ms command must precede any text input. When in doubt, 
use .LP to get proper initialization, although any of the commands .PP, .LP, .TL, .SH, .NH is 
good enough. Figure 1 shows the legal arrangement of commands at the start of a document. 


Cover Sheets and First Pages. The first line of a document signals the general format 
of the first page. In particular, if it is “.RP” a cover sheet with title and abstract is prepared. 
The default format is useful for scanning drafts. 


UNIX is a Trademark of Bell Laboratories 


In general -ms is arranged so that only one form of a document need be stored, contain- 
ing all information; the first command gives the format, and unnecessary items for that for- 
mat are ignored. 


Warning: don’t put extraneous material between the .TL and .AE commands. Process- 
ing of the titling items is special, and other data placed in them may not behave as you 
expect. Don’t forget that some -ms command must precede any input text. 


Page headings. The -ms macros, by default, will print a page heading containing a 
page number (if greater than 1). A default page footer is provided only in nroff, where the 
date is used. The user can make minor adjustments to the page headings/footings by 
redefining the strings LH, CH, and RH which are the left, center and right portions of the 
page headings, respectively; and the strings LF, CF, and RF, which are the left, center and 
right portions of the page footer. For more complex formats, the user can redefine the macros 
PT and BT, which are invoked respectively at the top and bottom of each page. The margins 
(taken from registers HM and FM for the top and bottom margin respectively) are normally 1 
inch; the page header/footer are in the middle of that space. The user who redefines these 
macros should be careful not to change parameters such as point size or font without resetting 
them to default values. 
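

As a hedged illustration of these adjustments, the fragment below redefines some of the heading and footing strings with the troff .ds request and changes the top margin with .nr. The wording of the strings and the 1.5 inch margin are arbitrary choices made for this sketch, and the use of % to stand for the page number assumes that the default PT macro hands these strings to an ordinary troff title line. 

```roff
.\" Sketch only: string contents and sizes are illustrative assumptions.
.\" Left heading, empty center heading, page number at the right.
.ds LH Draft
.ds CH
.ds RH Page %
.\" Center footer text.
.ds CF Internal use only
.\" Top margin; takes effect on the next page.
.nr HM 1.5i
.LP
Text of the document begins here.
```

Because only strings and a register are changed, the PT and BT macros themselves are left alone, which avoids the risk of disturbing point size or font. 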


Multi-column formats. If you place 
the command “.2C” in your document, the 
document will be printed in double column 
format beginning at that point. This 
feature is not too useful in computer termi- 
nal output, but is often desirable on the 
typesetter. The command “.1C” will go 
back to one-column format and also skip to 
a new page. The “.2C” command is actu- 
ally a special case of the command 


.MC [column width [gutter width]] 


which makes multiple columns with the 
specified column and gutter width; as many 
columns as will fit across the page are used. 
Thus triple, quadruple, ... column pages can 
be printed. Whenever the number of 
columns is changed (except going from full 
width to some larger number of columns) a 
new page is started. 
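

A minimal sketch of switching column formats follows. The paragraph text is filler, and the widths on the commented .MC line are assumed values chosen so that three columns and two gutters fit a 6 inch line; they are not taken from the package. 

```roff
.LP
This paragraph is set across the full width of the page.
.2C
.LP
From this point on the text is set in two columns.
.1C
.LP
This paragraph returns to one column and starts on a new page.
.\" Assumed widths, for illustration only: three 1.8 inch columns
.\" with 0.3 inch gutters would be requested with
.\" .MC 1.8i 0.3i
```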


Headings. To produce a special 
heading, there are two commands. If you 
type 


.NH 
type section heading here 
may be several lines 

you will get automatically numbered section 
headings (1, 2, 3, ...), in boldface. For 
example, 


.NH 
Care and Feeding of Department Heads 


produces 


1. Care and Feeding of Department 
Heads 


Alternatively, 


.SH 
Care and Feeding of Directors 


will print the heading with no number 
added: 


Care and Feeding of Directors 


Every section heading, of either type, 
should be followed by a paragraph begin- 
ning with .PP or .LP, indicating the end of 
the heading. Headings may contain more 
than one line of text. 


The .NH command also supports 
more complex numbering schemes. If a 
numerical argument is given, it is taken to 
be a “level” number and an appropriate 
sub-section number is generated. Larger 
level numbers indicate deeper sub-sections, 
as in this example: 


.NH 
Erie-Lackawanna 
.NH 2 
Morris and Essex Division 
.NH 3 
Gladstone Branch 
.NH 3 
Montclair Branch 
.NH 2 
Boonton Line 


generates: 


2. Erie-Lackawanna 
2.1. Morris and Essex Division 
2.1.1. Gladstone Branch 
2.1.2. Montclair Branch 
2.2. Boonton Line 


An explicit “.NH 0” will reset the 
numbering of level 1 to one, as here: 


.NH 0 
Penn Central 


1. Penn Central 


Indented paragraphs. (Paragraphs 
with hanging numbers, e.g. references.) The 
sequence 


.IP [1] 
Text for first paragraph, typed 
normally for as long as you would 
like on as many lines as needed. 
.IP [2] 
Text for second paragraph, ... 


produces 


[1] Text for first paragraph, typed nor- 
mally for as long as you would like on 
as many lines as needed. 


[2] Text for second paragraph, ... 


A series of indented paragraphs may be fol- 
lowed by an ordinary paragraph beginning 
with .PP or .LP, depending on whether you 
wish indenting or not. The command .LP 
was used here. 


More sophisticated uses of .IP are also 
possible. If the label is omitted, for exam- 
ple, a plain block indent is produced. 


.IP 
This material will 
just be turned into a 
block indent suitable for quotations or 
such matter. 
.LP 


will produce 


This material will just be turned into 
a block indent suitable for quotations 
or such matter. 


If a non-standard amount of indenting is 
required, it may be specified after the label 
(in character positions) and will remain in 
effect until the next .PP or .LP. Thus, the 
general form of the .IP command contains 
two additional fields: the label and the 
indenting length. For example, 


.IP first: 9 
Notice the longer label, requiring larger 
indenting for these paragraphs. 
.IP second: 
And so forth. 
.LP 


produces this: 


first: Notice the longer label, requiring 
larger indenting for these para- 
graphs. 


second: And so forth. 


It is also possible to produce multiple 
nested indents; the command .RS indicates 
that the next .IP starts from the current 
indentation level. Each .RE will eat up one 
level of indenting so you should balance .RS 
and .RE commands. The .RS command 
should be thought of as “move right” and 
the .RE command as “move left”. As an 
example 


.IP 1. 
Bell Laboratories 
.RS 
.IP 1.1 
Murray Hill 
.IP 1.2 
Holmdel 
.IP 1.3 
Whippany 
.RS 
.IP 1.3.1 
Madison 
.RE 
.IP 1.4 
Chester 
.RE 
.LP 


will result in 


1. Bell Laboratories 
   1.1 Murray Hill 
   1.2 Holmdel 
   1.3 Whippany 
       1.3.1 Madison 
   1.4 Chester 


All of these variations on .LP leave the 
right margin untouched. Sometimes, for 
purposes such as setting off a quotation, a 
paragraph indented on both right and left 
is required. 


A single paragraph like this is 
obtained by preceding it with 
.QP. More complicated material 
(several paragraphs) should be 
bracketed with .QS and .QE. 


Emphasis. To get italics (on the 
typesetter) or underlining (on the terminal) 
say 


.I 
as much text as you want 
can be typed here 
.R 


as was done for these three words. The .R 
command restores the normal (usually 
Roman) font. If only one word is to be ital- 
icized, it may be just given on the line with 
the .I command, 


.I word 


and in this case no .R is needed to restore 
the previous font. Boldface can be pro- 
duced by 


.B 
Text to be set in boldface 
goes here 
.R 


and also will be underlined on the terminal 
or line printer. As with .I, a single word 
can be placed in boldface by placing it on 
the same line as the .B command. 


A few size changes can be specified 
similarly with the commands .LG (make 
larger), .SM (make smaller), and .NL 
(return to normal size). The size change is 
two points; the commands may be repeated 
for increased effect (here one .NL canceled two 
.SM commands). 
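

A small sketch of the size commands follows, assuming they are placed on lines by themselves between stretches of text in the same way as .I and .R above. The sentences are filler. 

```roff
.LP
This sentence is in the normal size.
.LG
This sentence is two points larger.
.LG
This sentence is four points larger.
.NL
This sentence is back in the normal size.
.SM
This sentence is two points smaller.
.NL
Normal size again.
```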


If actual underlining as opposed to 
italicizing is required on the typesetter, the 
command 


.UL word 


will underline a word. There is no way to 
underline multiple words on the typesetter. 


Footnotes. Material placed between 
lines with the commands .FS (footnote) and 
.FE (footnote end) will be collected, remem- 
bered, and finally placed at the bottom of 
the current page*. By default, footnotes 
are 11/12th the length of normal text, but 
this can be changed using the FL register 
(see below). 
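

The following sketch shows one way a footnote might be typed. The asterisk is entered by hand both in the text and in the footnote, since nothing above says the package supplies a mark, and the 5 inch value for FL is an arbitrary illustration. 

```roff
.LP
The result was first reported by Smith.*
.FS
* This text is collected and placed at the bottom of the current page.
.FE
The paragraph continues here after the footnote lines.
.\" Assumed value: shorten later footnotes; takes effect at the next .FS.
.nr FL 5i
```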


Displays and Tables. To prepare 
displays of lines, such as tables, in which 
the lines should not be re-arranged, enclose 
them in the commands .DS and .DE 


.DS 

table lines, like the 
examples here, are placed 
between .DS and .DE 
.DE 


By default, lines between .DS and .DE are 
indented and left-adjusted. You can also 
center lines, or retain the left margin. 
Lines bracketed by .DS C and .DE com- 
mands are centered (and not re-arranged); 
lines bracketed by .DS L and .DE are left- 
adjusted, not indented, and not re- 
arranged. A plain .DS is equivalent to .DS 
I, which indents and left-adjusts. Thus, 


these lines were preceded 
by .DS C and followed by 
a .DE command; 


whereas 


these lines were preceded 
by .DS L and followed by 
a .DE command. 


Note that .DS C centers each line; there is a 
variant .DS B that makes the display into a 
left-adjusted block of text, and then centers 
that entire block. Normally a display is 
kept together, on one page. If you wish to 
have a long display which may be split 
across page boundaries, use .CD, .LD, or 
.ID in place of the commands .DS C, .DS L, 
or .DS I respectively. An extra argument to 
the .DS I or .DS command is taken as an 
amount to indent. Note: it is tempting to 
assume that .DS R will right adjust lines, 
but it doesn’t work. 


Boxing words or lines. To draw rec- 
tangular boxes around words the command 


.BX word 


will print word enclosed in a box. The boxes will 
not be neat on a terminal, and this should 
not be used as a substitute for italics. 


* Like this. 


Longer pieces of text may be boxed by 
enclosing them with .B1 and .B2: 


.B1 
text... 
.B2 


as has been done here. 


Keeping blocks together. If you wish 
to keep a table or other block of lines 
together on a page, there are “keep - 
release” commands. If a block of lines pre- 
ceded by .KS and followed by .KE does not 
fit on the remainder of the current page, it 
will begin on a new page. Lines bracketed 
by .DS and .DE commands are automati- 
cally kept together this way. There is also 
a “keep floating” command: if the block to 
be kept together is preceded by .KF instead 
of .KS and does not fit on the current page, 
it will be moved down through the text 
until the top of the next page. Thus, no 
large blank space will be introduced in the 
document. 
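

As a rough sketch, the two kinds of keep might be typed as follows. The text inside each block is filler. 

```roff
.KS
.LP
These lines stay together.
If they do not fit on the rest of the page,
the whole block begins on a new page.
.KE
.KF
.LP
This block may float: if it does not fit here,
later text fills out the page and the block
appears at the top of the next page.
.KE
```

The floating form is usually the better choice for tables and figures that the text refers to by number, since their exact position matters less than avoiding a large gap. 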


Nroff/Troff commands. Among the 
useful commands from the basic formatting 
programs are the following. They all work 
with both typesetter and computer terminal 
output: 


.bp - begin new page. 
.br - “break”, stop running text 
from line to line. 
.sp n - insert n blank lines. 
.na - don’t adjust right margins. 


Date. By default, documents pro- 
duced on computer terminals have the date 
at the bottom of each page; documents pro- 
duced on the typesetter don’t. To force the 
date, say “.DA”. To force no date, say 
“.ND”. To lie about the date, say “.DA 
July 4, 1776” which puts the specified date 
at the bottom of each page. The command 


.ND May 8, 1945 


in “.RP” format places the specified date on 
the cover sheet and nowhere else. Place 
this line before the title. 


Signature line. You can obtain a sig- 
nature line by placing the command .SG in 
the document. The authors’ names will be 
output in place of the .SG line. An argu- 
ment to .SG is used as a typing 
identification line, and placed after the sig- 
natures. The .SG command is ignored in 
released paper format. 


Registers. Certain of the registers 
used by -ms can be altered to change 
default settings. They should be changed 
with .nr commands, as with 


.nr PS 9 


to make the default point size 9 point. If 
the effect is needed immediately, the nor- 
mal troff command should be used in addi- 
tion to changing the number register. 


Register  Defines          Takes effect  Default 
PS        point size       next para.    10 
VS        line spacing     next para.    12 pts 
LL        line length      next para.    6" 
LT        title length     next para.    6" 
PD        para. spacing    next para.    0.3 VS 
PI        para. indent     next para.    5 ens 
FL        footnote length  next FS       11/12 LL 
CW        column width     next 2C       7/15 LL 
GW        intercolumn gap  next 2C       1/15 LL 
PO        page offset      next page     26/27" 
HM        top margin       next page     1" 
FM        bottom margin    next page     1" 


You may also alter the strings LH, CH, and 
RH which are the left, center, and right 
headings respectively; and similarly LF, CF, 
and RF which are strings in the page footer. 
The page number on output is taken from 
register PN, to permit changing its output 
style. For more complicated headers and 
footers the macros PT and BT can be 
redefined, as explained earlier. 
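

The sketch below illustrates the remark about immediate effect: each register is changed with .nr, and the matching basic troff request (.ps for point size, .vs for line spacing) is given as well so that the change does not wait for the next paragraph. The particular sizes, and the paragraph spacing of half a line, are assumed values chosen only for illustration. 

```roff
.\" New defaults; by themselves these take effect at the next paragraph.
.nr PS 9
.nr VS 11
.\" The same sizes applied immediately with the basic troff requests.
.ps 9
.vs 11
.\" Assumed value: wider spacing between paragraphs.
.nr PD 0.5v
.LP
This paragraph is set in 9 point type on 11 point line spacing.
```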


Accents. To simplify typing certain 
foreign words, strings representing common 
accent marks are defined. They precede 
the letter over which the mark is to appear. 
Here are the strings: 


Input   Output   Input   Output 
\*'e    é        \*~a    ã 
\*`e    è        \*Ce    ě 
\*:u    ü        \*,c    ç 
\*^e    ê 
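

For example, a line of input using three of these strings might look like the sketch below. The words are arbitrary, and the string names are the acute, umlaut, and cedilla strings as they are understood here to be defined by the package. 

```roff
.LP
Each string is typed just before the letter it marks:
caf\*'e, M\*:unchen, fa\*,cade.
```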


Use. After your document is 
prepared and stored on a file, you can print 
it on a terminal with the command* 


nroff -ms file 


and you can print it on the typesetter with 
the command 


troff -ms file 


(many options are possible). In each case, 
if your document is stored in several files, 
just list all the filenames where we have 
used “file”. If equations or tables are used, 
eqn and/or tbl must be invoked as prepro- 
cessors. 


* If .2C was used, pipe the nroff output through 
col; make the first line of the input “.pi 
/usr/bin/col.” 


References and further study. If 
you have to do Greek or mathematics, see 
eqn [1] for equation setting. To aid eqn 
users, -ms provides definitions of .EQ and 
.EN which normally center the equation 
and set it off slightly. An argument on .EQ 
is taken to be an equation number and 
placed in the right margin near the equa- 
tion. In addition, there are three special 
arguments to .EQ: the letters C, I, and L 
indicate centered (default), indented, and 
left adjusted equations, respectively. If 
there is both a format argument and an 
equation number, give the format argument 
first, as in 


.EQ L (1.3a) 


for a left-adjusted equation numbered 
(1.3a). 
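

A short sketch of the three forms follows. The equations themselves are placeholders written in eqn notation, and the file must be passed through eqn before nroff or troff for them to be set. 

```roff
.LP
A centered equation with a number:
.EQ (2)
x = a + b
.EN
An indented equation with a number:
.EQ I (3)
y = c over d
.EN
A left-adjusted equation with no number:
.EQ L
z = x sub i
.EN
```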


Similarly, the macros .TS and .TE are 
defined to separate tables (see [2]) from 
text with a little space. A very long table 
with a heading may be broken across pages 
by beginning it with .TS H instead of .TS, 
and placing the line .TH in the table data 
after the heading. If the table has no head- 
ing repeated from page to page, just use the 
ordinary .TS and .TE macros. 
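

As a sketch of the multi-page form, the input below begins a table with .TS H and marks the end of the repeated heading with .TH. The options and format lines follow tbl conventions as described in [2], the columns are separated by tabs, and the rows are made-up sample data. 

```roff
.TS H
center;
c c
l n.
Item	Count
.TH
apples	3
pears	12
plums	7
.TE
```

Everything between the format lines and .TH is treated as the heading to be repeated at the top of each page on which the table continues. 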


To learn more about troff see [3] for a 
general introduction, and [4] for the full 
details (experts only). Information on 
related UNIX commands is in [5]. For jobs 
that do not seem well-adapted to -ms, con- 
sider other macro packages. It is often far 
easier to write a specific macro package for 
such tasks as imitating particular journals 
than to try to adapt -ms. 


Acknowledgment. Many thanks are 
due to Brian Kernighan for his help in the 
manual. 


Appendix A 
List of Commands 


1C  Return to single column format.     LG  Increase type size. 
2C  Start double column format.         LP  Left aligned block paragraph. 
AB  Begin abstract. 
AE  End abstract. 
AI  Specify author’s institution. 
AU  Specify author.                     ND  Change or cancel date. 
B   Begin boldface.                     NH  Specify numbered heading. 
DA  Provide the date on each page.      NL  Return to normal type size. 
DE  End display.                        PP  Begin paragraph. 
DS  Start display (also CD, LD, ID). 
EN  End equation.                       R   Return to regular font (usually Roman). 
EQ  Begin equation.                     RE  End one level of relative indenting. 
FE  End footnote.                       RP  Use released paper format. 
FS  Begin footnote.                     RS  Relative indent increased one level. 
                                        SG  Insert signature line. 
I   Begin italics.                      SH  Specify section heading. 
                                        SM  Change to smaller type size. 
IP  Begin indented paragraph.           TL  Specify title. 
KE  Release keep. 
KF  Begin floating keep.                UL  Underline one word. 
KS  Start keep. 


Register Names 


The following register names are used by -ms internally. Independent use of these 
names in one’s own macros may produce incorrect output. Note that no lower case letters are 
used in any -ms internal name. 


Number registers used in -ms 


:   DW  GW  HM  IQ  LL  NA  OJ  PO  T.  TV 
#T  EF  H1  HT  IR  LT  NC  PD  PQ  TB  VS 
1T  FL  H3  IK  KI  MM  NF  PF  PX  TD  YE 
AV  FM  H4  IM  L1  MN  NS  PI  RO  TN  YY 
CW  FP  H5  IP  LE  MO  OI  PN  ST  TQ  ZN 


String registers used in -ms 


:   A5  CB  DW  EZ  I   KF  MR  R1  RT  TL 
    AB  CC  DY  FA  I1  KQ  ND  R2  S0  TM 
    AE  CD  E1  FE  I2  KS  NH  R3  S1  TQ 
~   AI  CF  E2  FJ  I3  LB  NL  R4  S2  TS 
    AU  CH  E3  FK  I4  LD  NP  R5  SG  TT 
,   B   CM  E4  FN  I5  LG  OD  RC  SH  UL 
1C  BG  CS  E5  FO  ID  LP  OK  RE  SM  WB 
2C  BT  CT  EE  FQ  IE  ME  PP  RF  SN  WH 
A1  C   D   EL  FS  IM  MF  PT  RH  SY  WT 
A2  C1  DA  EM  FV  IP  MH  PY  RP  TA  XD 
A3  C2  DE  EN  FY  IZ  MN  QF  RQ  TE  XF 
A4  CA  DS  EQ  HO  KE  MO  R   RS  TH  XK 


Figure 1. Order of Commands in Input 


A Guide to Preparing 
Documents with -ms 


M. E. Lesk 


Bell Laboratories August 1978 





This guide gives some simple examples of 
document preparation on Bell Labs computers, 
emphasizing the use of the -ms macro package. 
It enormously abbreviates information in 
1. Typing Documents on UNIX and GCOS, by 
M. E. Lesk; 
2. Typesetting Mathematics - User's Guide, 
by B. W. Kernighan and L. L. Cherry; and 
3. Tbl - A Program to Format Tables, by M. 
E. Lesk. 
These memos are all included in the UNIX 
Programmer's Manual, Volume 2. The new 
user should also have A Tutorial Introduction to 
the UNIX Text Editor, by B. W. Kernighan. 


For more detailed information, read Advanced 
Editing on UNIX and A Troff Tutorial, by B. W. 
Kernighan, and (for experts) Nroff/Troff Refer- 
ence Manual by J. F. Ossanna. Information on 
related commands is found (for UNIX users) in 
UNIX for Beginners by B. W. Kernighan and 
the UNIX Programmer's Manual by K. Thomp- 
son and D. M. Ritchie. 


Contents 
A TM 
A released paper 
An internal memo, and headings 
Lists, displays, and footnotes 
Indents, keeps, and double column 
Equations and registers 
Tables and usage 


Throughout the examples, input is shown in 
this Helvetica sans serif font 

while the resulting output is shown in 
this Times Roman font. 


UNIX Document no. 1111 


Commands for a TM 


.TM 1978-563 99999 99999-11 
.ND April 1, 1976 
.TL 
The Role of the Allen Wrench in Modern 
Electronics 
.AU "MH 2G-111" 2345 
J. Q. Pencilpusher 
.AU "MH 1K-222" 5432 
X. Y. Hardwired 
.AI 
.MH 
.OK 
Tools 
Design 
.AB 
This abstract should be short enough to 
fit on a single page cover sheet. 
It must attract the reader into sending for 
the complete memorandum. 
.AE 
.CS 10 2 12 5 6 7 
.NH 
Introduction. 
.PP 
Now the first paragraph of actual text ... 


Last line of text. 
.SG MH-1234-JQP/XYH-unix 
.NH 
References ... 


Commands not needed in a particular format are ig- 
nored. 


Bell Laboratories                                   Cover Sheet for TM 

This information is for employees of Bell Laboratories. (GEI 13.9-3) 

Title- The Role of the Allen Wrench                 Date- April 1, 1976 
       in Modern Electronics 
                                                    TM- 1978-563 
Other Keywords- Tools 
                Design 

Author               Location    Ext.    Charging Case- 99999 
J. Q. Pencilpusher   MH 2G-111   2345    Filing Case- 999998 
X. Y. Hardwired      MH 1K-222   5432 

ABSTRACT 
This abstract should be short enough to 
fit on a single page cover sheet. It must 
attract the reader into sending for the com- 
plete memorandum. 

Pages Text 10   Other 2   Total 12 
No. Figures 5   No. Tables 6   No. Refs. 7 

E-1932-U (6-73) SEE REVERSE SIDE FOR DISTRIBUTION LIST 


A Released Paper with Mathematics 


.£OQ 
delim $$ 
.EN 
AP 


.. (as for a TM) 


C$ 10212567 
.NH 

Introduction 

PP 


The ae to the torque handle equation 

£Q (1 

sum from 0 to inf F (x subi) = G (x) 

.EN 

is found with the transformation § x = rho over 
theta S$ where $ rho = G prime (x) S and SthetaS 
is derived ... 


The Role of the Allen Wrench 
in Modern Electronics 


J. Q. Pencilpusher 
X. Y. Hardwired 


Bell Laboratories 
Murray Hill, New Jersey 07974 


ABSTRACT 


This abstract should be short enough to fit on a 
single page cover sheet. It must attract the 
reader into sending for the complete memoran- 
dum. 


April 1, 1976 


The Role of the Allen Wrench 
in Modern Electronics 


J. Q. Pencilpusher 
X. Y. Hardwired 


Bell Laboratories 
Murray Hill, New Jersey 07974 


1. Introduction 
The solution to the torque handle equation 

\[ \sum_{0}^{\infty} F(x_i) = G(x) \qquad (1) \] 

is found with the transformation \( x = \rho / \theta \) where \( \rho = G'(x) \) and 
\( \theta \) is derived from well-known principles. 












An Internal Memorandum 


.IM 
.ND January 24, 1956 
.TL 
The 1956 Consent Decree 
.AU 
Able, Baker & 
Charley, Attys. 
.PP 
Plaintiff, United States of America, having filed 
its complaint herein on January 14, 1949; the 
defendants having appeared and filed their 
answer to such complaint denying the 
substantive allegations thereof; and the parties, 
by their attorneys, ... 





Bell Laboratories 
Subject: The 1956 Consent Decree          date: January 24, 1956 

                                          from: Able, Baker & 
                                                Charley, Attys. 


Plaintiff, United States of America, having filed its com- 
plaint herein on January 14, 1949; the defendants having 
appeared and filed their answer to such complaint denying 
the substantive allegations thereof; and the parties, by their 
attorneys, having severally consented to the entry of this 
Final Judgment without trial or adjudication of any issue 
of fact or law herein and without this Final Judgment con- 
stituting any evidence or admission by any party in respect 
of any such issues; 


Now, therefore before any testimony has been taken 
herein, and without trial or adjudication of any issue of fact 
or law herein, and upon the consent of all parties hereto, it 
is hereby 

Ordered, adjudged and decreed as follows: 

I. [Sherman Act] 

This Court has jurisdiction of the subject matter herein 
and of all the parties hereto. The complaint states a claim 
upon which relief may be granted against each of the 
defendants under Sections 1, 2 and 3 of the Act of 
Congress of July 2, 1890, entitled “An act to protect trade 
and commerce against unlawful restraints and monopo- 
lies,” commonly known as the Sherman Act, as amended. 


II. [Definitions] 
For the purposes of this Final Judgment: 


(a) “Western” shall mean the defendant Western Elec- 
tric Company, Incorporated. 


Other formats possible (specify before .TL) are: .MR 
(“memo for record”), .MF (“memo for file”), .EN 
(“engineer’s notes”) and .TR (Computing Science 
Tech. Report). 


Headings 
.NH                       .SH 
Introduction.             Appendix I 
.PP                       .PP 
text text text            text text text 


1. Introduction 
text text text 


Appendix I 
text text text 


A Simple List 


.IP 1. 
J. Pencilpusher and X. Hardwired, 
.I 
A New Kind of Set Screw, 
.R 
Proc. IEEE 
.B 75 
(1976), 23-255. 
.IP 2. 
H. Nails and R. Irons, 
.I 
Fasteners for Printed Circuit Boards, 
.R 
Proc. ASME 
.B 23 
(1974), 23-24. 
.LP (terminates list) 


1. J. Pencilpusher and X. Hardwired, A New Kind 
of Set Screw, Proc. IEEE 75 (1976), 23-255. 

2. H. Nails and R. Irons, Fasteners for Printed Cir- 
cuit Boards, Proc. ASME 23 (1974), 23-24. 


Displays 
text text text text text text 
.DS 
and now 
for something 
completely different 
.DE 
text text text text text text 


hoboken harrison newark roseville avenue grove 
street east orange brick church orange highland ave- 
nue mountain station south orange maplewood 
millburn short hills summit new providence 


     and now 
     for something 
     completely different 


murray hill berkeley heights gillette stirling milling- 
ton lyons basking ridge bernardsville far hills 
peapack gladstone 


Options: .DS L: left-adjust; .DS C: line-by-line 
center; .DS B: make block, then center. 


Footnotes 


Among the most important occupants 
of the workbench are the long-nosed pliers. 
Without these basic tools* 
.FS 
* As first shown by Tiger & Leopard 
(1975). 
.FE 
few assemblies could be completed. They may 
lack the popular appeal of the sledgehammer 


Among the most important occupants of the work- 
bench are the long-nosed pliers. Without these basic 
tools* few assemblies could be completed. They 
may lack the popular appeal of the sledgehammer 


* As first shown by Tiger & Leopard (1975). 
Multiple Indents 


This is ordinary text to point out 
the margins of the page. 
.IP 1. 
First level item 
.RS 
.IP a) 
Second level. 
.IP b) 
Continued here with another second 
level item, but somewhat longer. 
.RE 
.IP 2. 
Return to previous value of the 
indenting at this point. 
.IP 3. 
Another 
line. 


This is ordinary text to point out the margins of the 
page. 
1.   First level item 
     a)   Second level. 
     b)   Continued here with another second level 
          item, but somewhat longer. 
2.   Return to previous value of the indenting at this 
     point. 
3.   Another line. 


Keeps 


Lines bracketed by the following commands are kept 
together, and will appear entirely on one page: 


.KS not moved 
.KE through text 


.KF may float 
.KE in text 
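
As a rough sketch of how the two kinds of keep might be used together (the heading, text, and caption here are made up for illustration), the first keep holds a heading with the lines that follow it, and the second lets a captioned figure float: 

```roff
.\" Hypothetical example: an ordinary keep, then a floating keep.
.KS
.SH
Torque Limits
.LP
These lines stay together on one page;
if they do not fit on the current page,
they all move to the next one.
.KE
.KF
.ce
Figure 1. A placeholder caption for a floating keep.
.KE
```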


Double Column 


.TL 
The Declaration of Independence 
.2C 
.PP 


When in the course of human events, it becomes 
necessary for one people to dissolve the 
political bonds which have connected them with 
another, and to assume among the powers of the 
earth the separate and equal station to which 
the laws of Nature and of Nature’s God entitle 
them, a decent respect to the opinions of 


The Declaration of Independence 


When in the course of 
human events, it be- 
comes necessary for one 
people to dissolve the 
political bonds which 
have connected them 
with another, and to as- 
sume among the powers 
of the earth the separate 
and equal station to 
which the laws of Nature 
and of Nature’s God en- 
title them, a decent 
respect to the opinions 
of mankind requires that 


they should declare the 
causes which impel them 
to the separation. 

We hold these truths 
to be self-evident, that 
all men are created 
equal, that they are en- 
dowed by their creator 
with certain unalienable 
rights, that among these 
are life, liberty, and the 
pursuit of happiness. 
That to secure these 
rights, governments are 
instituted among men, 
Equations 


A displayed equation is marked 
with an equation number at the right margin 
by adding an argument to the EQ line: 
.EQ (1.3) 
x sup 2 over a sup 2 ~=~ sqrt {p z sup 2 +qz+r} 
.EN 


A displayed equation is marked with an equation 
number at the right margin by adding an argument 
to the EQ line: 

\[ \frac{x^2}{a^2} = \sqrt{p z^2 + q z + r} \qquad (1.3) \] 


.EQ I (2.2a) 
bold V bar sub nu~=~left [ pile {a above b above 
c } right ] + left [ matrix { col { A(11) above . 
above . } col { . above . above . } col { . above . 
above A(33) }} right ] cdot left [ pile { alpha 
above beta above gamma } right ] 
.EN 

\[ \bar{\mathbf{V}}_{\nu} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} + \begin{bmatrix} A(11) & . & . \\ . & . & . \\ . & . & A(33) \end{bmatrix} \cdot \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} \qquad (2.2a) \] 


.EQ L 
F hat ( chi ) ~ mark = ~ | del V | sup 2 
.EN 
.EQ L 
lineup =~ { left ( {partial V} over {partial x} right ) 
} sup 2 + { left ( {partial V} over {partial y} right 
) } sup 2 ~~~~~~ lambda -> inf 
.EN 

\[ \begin{aligned} \hat{F}(\chi) &= |\nabla V|^2 \\ &= \left( \frac{\partial V}{\partial x} \right)^2 + \left( \frac{\partial V}{\partial y} \right)^2 \qquad \lambda \to \infty \end{aligned} \] 


$ a dot $, $ b dotdot $, $ xi tilde times y vec $: 
\( \dot{a} \), \( \ddot{b} \), \( \tilde{\xi} \times \vec{y} \). 
(with delim $$ on, see panel 3). 


See also the equations in the second table, panel 8. 


Some Registers You Can Change 


Line length                    Paragraph spacing 
    .nr LL 7i                      .nr PD 0 
Title length                   Page offset 
    .nr LT 7i                      .nr PO 0.5i 
Point size                     Page heading 
    .nr PS 9                       .ds CH Appendix 
Vertical spacing                       (center) 
    .nr VS 11                      .ds RH 7-25-76 
Column width                           (right) 
    .nr CW 3i                      .ds LH Private 
Intercolumn spacing                    (left) 
    .nr GW .5i                 Page footer 
Margins - head and foot            .ds CF Draft 
    .nr HM .75i                    .ds LF 
    .nr FM .75i                    .ds RF   similar 
Paragraph indent               Page numbers 
    .nr PI 2n                      .nr % 3 
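
A minimal sketch of how a few of these registers and strings might be set at the top of a document follows. The particular values are arbitrary, chosen only for illustration, and it is assumed here that the settings come before the first paragraph macro, since some registers appear to be read only when the next paragraph or page begins: 

```roff
.\" Hypothetical settings; the values are illustrative only.
.nr LL 6i
.nr PO 1i
.nr PS 10
.nr VS 12
.ds CH Draft
.ds CF - % -
.PP
The first paragraph is set with the new line length,
point size, and vertical spacing.
```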







Tables 
.TS (Ⓣ indicates a tab) 
allbox; 
c s s 
c c c 
n n n. 
AT&T Common Stock 
Year Ⓣ Price Ⓣ Dividend 
1971 Ⓣ 41-54 Ⓣ $2.60 
2 Ⓣ 41-54 Ⓣ 2.70 
3 Ⓣ 46-55 Ⓣ 2.87 
4 Ⓣ 40-53 Ⓣ 3.24 
5 Ⓣ 45-52 Ⓣ 3.40 
6 Ⓣ 51-59 Ⓣ .95* 
.TE 
* (first quarter only) 


    AT&T Common Stock 
Year | Price | Dividend 
1971 | 41-54 |   $2.60 
   2 | 41-54 |    2.70 
   3 | 46-55 |    2.87 
   4 | 40-53 |    3.24 
   5 | 45-52 |    3.40 
   6 | 51-59 |     .95* 

* (first quarter only) 


The meanings of the key-letters describing the align- 
ment of each entry are: 


c   center          n   numerical 
r   right-adjust    a   subcolumn 
l   left-adjust     s   spanned 


The global table options are center, expand, box, 
doublebox, allbox, tab (x) and linesize (n). 


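.TS (with delim $$ on, see panel 3) 
doublebox, center; 
c c 
l l. 
Name Ⓣ Definition 
.sp 
Gamma Ⓣ $GAMMA (z) = int sub 0 sup inf \ 
t sup {z-1} e sup -t dt$ 
Sine Ⓣ $sin (x) = 1 over 2i ( e sup ix - e sup -ix )$ 
Error Ⓣ $ roman erf (z) = 2 over sqrt pi \ 
int sub 0 sup z e sup {-t sup 2} dt$ 
Bessel Ⓣ $ J sub 0 (z) = 1 over pi \ 
int sub 0 sup pi cos ( z sin theta ) d theta $ 
Zeta Ⓣ $ zeta (s) = \ 
sum from k=1 to inf k sup -s ~~( Re~s > 1)$ 
.TE 


Name     Definition 

Gamma    \( \Gamma(z) = \int_0^\infty t^{z-1} e^{-t} \, dt \) 
Sine     \( \sin(x) = \frac{1}{2i} ( e^{ix} - e^{-ix} ) \) 
Error    \( \operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} \, dt \) 
Bessel   \( J_0(z) = \frac{1}{\pi} \int_0^\pi \cos(z \sin\theta) \, d\theta \) 
Zeta     \( \zeta(s) = \sum_{k=1}^{\infty} k^{-s} \quad (\operatorname{Re} s > 1) \) 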

Usage 


Documents with just text: 
troff -ms files 
With equations only: 
eqn files | troff -ms 
With tables only: 
tbl files | troff -ms 
With both tables and equations: 
tbl files | eqn | troff -ms 


The above generates STARE output on GCOS: replace 
-st with -ph for typesetter output. 
A Revised Version of -ms 


Bill Tuthill 


Computing Services 
University of California 
Berkeley, CA 94720 


The -ms macros have been slightly revised and rearranged. Because of the rearrangement, 
the new macros can be read by the computer in about half the time required by the 
previous version of -ms. This means that output will begin to appear between ten seconds and 
several minutes more quickly, depending on the system load. On long files, however, the 
savings in total time are not substantial. The old version of -ms is still available as -mos. 


Several bugs in -ms have been fixed, including a serious problem with the .1C macro, minor 
difficulties with boxed text, a break induced by .EQ before initialization, the failure to set tab 
stops in displays, and several bothersome errors in the refer macros. Macros used only at 
Bell Laboratories have been removed. There are a few extensions to previous -ms macros, and 
a number of new macros, but all the documented -ms macros still work exactly as they did 
before, and have the same names as before. Output produced with -ms should look like 
output produced with -mos. 


One important new feature is automatically numbered footnotes. Footnote numbers are 
printed by means of a pre-defined string (\**), which you invoke separately from .FS and .FE. 
Each time it is used, this string increases the footnote number by one, whether or not you use 
.FS and .FE in your text. Footnote numbers will be superscripted on the phototypesetter and 
on daisy-wheel terminals, but on low-resolution devices (such as the lpr and a crt), they will 
be bracketed. If you use \** to indicate numbered footnotes, then the .FS macro will 
automatically include the footnote number at the bottom of the page. This footnote, for 
example, was produced as follows:1 


This footnote, for example, was produced as follows:\** 
.FS 


.FE 


If you are using \** to number footnotes, but want a particular footnote to be marked with an 
asterisk or a dagger, then give that mark as the first argument to .FS: † 


then give that mark as the first argument to .FS: \(dg 
.FS \(dg 


.FE 


Footnote numbering will be temporarily suspended, because the \** string is not used. 
Instead of a dagger, you could use an asterisk * or double dagger ‡, represented as \(dd. 


1 If you never use the “\**” string, no footnote numbers will appear anywhere in the text, including down 
here. The output footnotes will look exactly like footnotes produced with -mos. 


† In the footnote, the dagger will appear where the footnote number would otherwise appear, as on the 
left. 
Another new feature is a macro for printing theses according to Berkeley standards. 
This macro is called .TM, which stands for thesis mode. (It is much like the .th macro in 
-me.) It will put page numbers in the upper right-hand corner; number the first page; suppress 
the date; and doublespace everything except quotes, displays, and keeps. Use it at the top of 
each file making up your thesis. Calling .TM defines the .CT macro for chapter titles, which 
skips to a new page and moves the page number to the center footer. The .P1 (P one) macro 
can be used even without thesis mode to print the header on page 1, which is suppressed 
except in thesis mode. If you want roman numeral page numbering, use an “.af PN i” 
request. 


There is a new macro especially for bibliography entries, called .XP, which stands for 
exdented paragraph. It will exdent the first line of the paragraph by \n(PI units, usually 5n 
(the same as the indent for the first line of a .PP). Most bibliographies are printed this way. 
Here are some examples of exdented paragraphs: 


Lumley, Lyle S., Sex in Crustaceans: Shell Fish Habits, Harbinger Press, Tampa Bay and 
San Diego, October 1979. 243 pages. The pioneering work in this field. 


Leffadinger, Harry A., “Mollusk Mating Season: 52 Weeks, or All Year?” in Acta Biologica, 
vol. 42, no. 11, November 1980. A provocative thesis, but the conclusions are wrong. 


Of course, you will have to take care of italicizing the book title and journal, and quoting the 
title of the journal article. Indentation or exdentation can be changed by setting the value of 
number register PI. 
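
The input for entries like those above might look roughly as follows. This is a sketch rather than the author's actual input: it assumes the .I macro with a quoted argument for the italic titles and the \*Q and \*U quotation strings for the article title, and the change to register PI is optional and shown only to illustrate the adjustment: 

```roff
.\" Sketch of input for two exdented bibliography entries.
.\" The PI setting is optional; 3n is an arbitrary value.
.nr PI 3n
.XP
Lumley, Lyle S.,
.I "Sex in Crustaceans: Shell Fish Habits,"
Harbinger Press, Tampa Bay and San Diego, October 1979.
243 pages.
The pioneering work in this field.
.XP
Leffadinger, Harry A.,
\*QMollusk Mating Season: 52 Weeks, or All Year?\*U in
.I "Acta Biologica,"
vol. 42, no. 11, November 1980.
A provocative thesis, but the conclusions are wrong.
```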


If you need to produce endnotes rather than footnotes, put the references in a file of 
their own. This is similar to what you would do if you were typing the paper on a conven- 
tional typewriter. Note that you can use automatic footnote numbering without actually hav- 
ing .FS and .FE pairs in your text. If you place footnotes in a separate file, you can use .IP 
macros with \** as a hanging tag; this will give you numbers at the left-hand margin. With 
some styles of endnotes, you would want to use .PP rather than .IP macros, and specify \** 
before the reference begins. 
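
A separate endnotes file along these lines might look like the sketch below. The heading and the note text are placeholders, and it is assumed that the file is formatted in its own run so that the numbering generated by \** starts again from one: 

```roff
.\" Hypothetical endnotes file, formatted separately from the paper.
.SH
Notes
.IP \**
Text of the first note, matching the first \** in the paper.
.IP \**
Text of the second note.
.\" Alternative style: an ordinary paragraph with the number in the text.
.PP
\** Text of the third note.
```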


There are four new macros to help produce a table of contents. Table of contents 
entries must be enclosed in .XS and .XE pairs, with optional .XA macros for additional 
entries; arguments to .XS and .XA specify the page number, to be printed at the right. A 
final .PX macro prints out the table of contents. Here is a sample of typical input and output 
text: 


.XS ii 
Introduction 
.XA 1 
Chapter 1: Review of the Literature 
.XA 23 
Chapter 2: Experimental Evidence 
.XE 
.PX 

                         Table of Contents 
Introduction ........................................... ii 
Chapter 1: Review of the Literature ..................... 1 
Chapter 2: Experimental Evidence ....................... 23 


The .XS and .XE pairs may also be used in the text, after a section header for instance, in 
which case page numbers are supplied automatically. However, most documents that require 
a table of contents are too long to produce in one run, which is necessary if this method is to 
work. It is recommended that you do a table of contents after finishing your document. To 
print out the table of contents, use the .PX macro; if you forget it, nothing will happen. 
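
For a document short enough to format in one run, the in-text method described above might be sketched as follows. The section titles are reused from the sample for illustration; no page argument is given to .XS, so the page numbers are supplied automatically: 

```roff
.\" Sketch: table of contents entries collected in the text itself.
.NH
Review of the Literature
.XS
Chapter 1: Review of the Literature
.XE
.PP
Text of the first chapter ...
.NH
Experimental Evidence
.XS
Chapter 2: Experimental Evidence
.XE
.PP
Text of the second chapter ...
.\" Print the collected entries at the end of the run.
.PX
```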
As an aid in producing text that will format correctly with both nroff and troff, there 
are some new string definitions that define quotation marks and dashes for each of these two 
formatting programs. The \*- string will yield two hyphens in nroff, but in troff it will pro- 
duce an em dash— like this one. The \*Q and \*U strings will produce “ and ” in troff, but " 
in nroff. (In typesetting, the double quote is traditionally considered bad form.) 
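
A short illustration of these strings in running text is given below; the sentence itself is invented for the example: 

```roff
.PP
The reviewer called the thesis \*Qprovocative\*U\*- and wrong.
```

In nroff the dash should come out as two hyphens and the quotation marks as plain double quotes, while troff should set an em dash and paired quotation marks from the same input. 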


There are now a large number of optional foreign accent marks defined by the -ms macros. 
All the accent marks available in -mos are present, and they all work just as they always 
did. However, there are better definitions available by placing .AM at the beginning of your 
document. Unlike the -mos accent marks, the accent strings should come after the letter 
being accented. Here is a list of the diacritical marks, with examples of what they look like. 





name of accent     input       output 
acute accent       e\*'        é 
grave accent       e\*`        è 
circumflex         o\*^        ô 
cedilla            c\*,        ç 
tilde              n\*~        ñ 
question           \*?         ¿ 
exclamation        \*!         ¡ 
umlaut             u\*:        ü 
digraph s          \*8         ß 
hacek              c\*v        č 
macron             a\*_        ā 
underdot           s\*.        ṣ 
o-slash            o\*/        ø 
angstrom           a\*o        å 
yogh               kni\*3t     kniȝt 
Thorn              \*(Th       Þ 
thorn              \*(th       þ 
Eth                \*(D-       Ð 
eth                \*(d-       ð 
hooked o           \*q         ǫ 
ae ligature        \*(ae       æ 
AE ligature        \*(Ae       Æ 
oe ligature        \*(oe       œ 
OE ligature        \*(Oe       Œ 


If you want to use these new diacritical marks, don’t forget the .AM at the top of your file. 
Without it, some will not print at all, and others will be placed on the wrong letter. 


It is also possible to produce custom headers and footers that are different on even and 
odd pages. The .OH and .EH macros define odd and even headers, while .OF and .EF define 
odd and even footers. Arguments to these four macros are specified as with .tl. This docu- 
ment was produced with: 


.OH '\fIThe -mx Macros''Page %\fP' 
.EH '\fIPage %''The -mx Macros\fP' 


Note that it would be an error to have an apostrophe in the header text; if you need one, you 
will have to use a different delimiter around the left, center, and right portions of the title. 
You can use any character as a delimiter, provided it doesn’t appear elsewhere in the 
argument to .OH, .EH, .OF, or .EF. 
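
For instance, a header containing an apostrophe might be written with a vertical bar as the delimiter. This is only a sketch; the title text is invented, and the vertical bar is simply one character that does not occur elsewhere in the argument: 

```roff
.\" Hypothetical headers using | instead of ' as the delimiter.
.OH |Tuthill's Notes||Page %|
.EH |Page %||Tuthill's Notes|
```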


The -ms macros work in conjunction with the tbl, eqn, and refer preprocessors. Macros 
to deal with these items are read in only as needed, as are the thesis macros (.TM), the 
special accent mark definitions (.AM), table of contents macros (.XS and .XE), and macros to 
format the optional cover page. The code for the -ms package lives in /usr/lib/tmac/tmac.s, 
and sourced files reside in the directory /usr/ucb/lib/ms. 
WRITING PAPERS WITH NROFF USING -ME 


Eric P. Allman 


Electronics Research Laboratory 
University of California, Berkeley 
Berkeley, California 94720 


This document describes the text processing facilities available on the UNIX† operating 
system via NROFF† and the -me macro package. It is assumed that the reader already is 
generally familiar with the UNIX operating system and a text editor such as ex. This is intended 
to be a casual introduction, and as such not all material is covered. In particular, many 
variations and additional features of the -me macro package are not explained. For a complete 
discussion of this and other issues, see The -me Reference Manual and The NROFF/TROFF 
Reference Manual. 


NROFF, a computer program that runs on the UNIX operating system, reads an input file 
prepared by the user and outputs a formatted paper suitable for publication or framing. The 
input consists of text, or words to be printed, and requests, which give instructions to the 
NROFF program telling how to format the printed copy. 


Section 1 describes the basics of text processing. Section 2 describes the basic requests. 
Section 3 introduces displays. Annotations, such as footnotes, are handled in section 4. The 
more complex requests which are not discussed in section 2 are covered in section 5. Finally, 
section 6 discusses things you will need to know if you want to typeset documents. If you are 
a novice, you probably won’t want to read beyond section 4 until you have tried some of the 
basic features out. 


When you have your raw text ready, call the NROFF formatter by typing as a request to 
the UNIX shell: 


nroff -me -Ttype files 


where type describes the type of terminal you are outputting to. Common values are dtc for 
a DTC 300s (daisy-wheel type) printer and lpr for the line printer. If the -T flag is omitted, 
a “lowest common denominator” terminal is assumed; this is good for previewing output on 
most terminals. A complete description of options to the NROFF command can be found in 
The NROFF/TROFF Reference Manual. 

The word argument is used in this manual to mean a word or number which appears on 
the same line as a request which modifies the meaning of that request. For example, the 
request 


.sp 
spaces one line, but 
.sp 4 
spaces four lines. The number 4 is an argument to the .sp request which says to space four 
lines instead of one. Arguments are separated from the request and from each other by 
spaces. 


1. Basics of Text Processing 


The primary function of NROFF is to collect words from input lines, fill output lines 
with those words, justify the right hand margin by inserting extra spaces in the line, and 
output the result. For example, the input: 


Now is the time 
for all good men 
to come to the aid 
of their party. 
Four score and seven 
years ago,... 


will be read, packed onto output lines, and justified to produce: 


Now is the time for all good men to come to the aid of their party. Four score 
and seven years ago.... 


Sometimes you may want to start a new output line even though the line you are on is not 
yet full; for example, at the end of a paragraph. To do this you can cause a break, which 
starts a new output line. Some requests cause a break automatically, as do blank input 
lines and input lines beginning with a space. 
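
The three ways of causing a break mentioned so far can be seen together in a small sketch (the text is invented; .br is the request that breaks without doing anything else): 

```roff
This line is filled together with the next one
as usual.
.br
The request above forces this text onto a new output line.

The blank input line above also causes a break,
 as does the leading space on this line.
```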


Not all input lines are text to be formatted. Some of the input lines are requests 
which describe how to format the text. Requests always have a period or an apostrophe 
(“'”) as the first character of the input line. 


The text formatter also does more complex things, such as automatically numbering 
pages, skipping over page folds, putting footnotes in the correct place, and so forth. 


I can offer you a few hints for preparing text for input to NROFF. First, keep the 
input lines short. Short input lines are easier to edit, and NROFF will pack words onto 
longer lines for you anyhow. In keeping with this, it is helpful to begin a new line after 
every period, comma, or phrase, since common corrections are to add or delete sentences or 
phrases. Second, do not put spaces at the end of lines, since this can sometimes confuse 
the NROFF processor. Third, do not hyphenate words at the end of lines (except words 
that should have hyphens in them, such as “mother-in-law”); NROFF is smart enough to 
hyphenate words for you as needed, but is not smart enough to take hyphens out and join 
a word back together. Also, words such as “mother-in-law” should not be broken over a 
line, since then you will get a space where not wanted, such as “mother- in-law”. 


2. Basic Requests 


2.1. Paragraphs 
Paragraphs are begun by using the .pp request. For example, the input: 


.pp 
Now is the time for all good men 
to come to the aid of their party. 
Four score and seven years ago,... 


produces a blank line followed by an indented first line. The result is: 


†UNIX, NROFF, and TROFF are Trademarks of Bell Laboratories 

     Now is the time for all good men to come to the aid of their party. Four 
score and seven years ago,... 


Notice that the sentences of the paragraphs must not begin with a space, since 
blank lines and lines beginning with spaces cause a break. For example, if I had typed: 


.pp 
Now is the time for all good men 
 to come to the aid of their party. 
Four score and seven years ago,... 


The output would be: 


Now is the time for all good men 
to come to the aid of their party. Four score and seven years ago... 


A new line begins after the word “men” because the second line began with a space 
character. 


There are many fancier types of paragraphs, which will be described later. 


2.2. Headers and Footers 


Arbitrary headers and footers can be put at the top and bottom of every page. 
Two requests of the form .he title and .fo title define the titles to put at the head and 
the foot of every page, respectively. The titles are called three-part titles, that is, there 
is a left-justified part, a centered part, and a right-justified part. To separate these 
three parts the first character of title (whatever it may be) is used as a delimiter. Any 
character may be used, but backslash and double quote marks should be avoided. The 
percent sign is replaced by the current page number whenever found in the title. For 
example, the input: 


.he ''%'' 
.fo 'Jane Jones''My Book' 


results in the page number centered at the top of each page, “Jane Jones” in the lower 
left corner, and “My Book” in the lower right corner. 
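
As a further sketch, a title whose text contains an apostrophe needs some other delimiter. The slash below is an arbitrary choice and the title text is invented; any character other than backslash or double quote that does not occur in the title should serve: 

```roff
.\" Hypothetical titles; / is used as the delimiter
.\" because the header text contains an apostrophe.
.he /Jane's Draft//Page %/
.fo //My Book//
```

This would put “Jane's Draft” at the upper left and the page number at the upper right of each page, with “My Book” centered at the foot. 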


2.3. Double Spacing 
NROFF will double space output text automatically if you use the request .ls 2, as 
is done in this section. You can revert to single spaced mode by typing .ls 1. 


2.4. Page Layout 


A number of requests allow you to change the way the printed copy looks, some- 
times called the layout of the output page. Most of these requests adjust the placing of 
“white space” (blank lines or spaces). In these explanations, characters in italics should 
be replaced with values you wish to use; bold characters represent characters which 
should actually be typed. 


The .bp request starts a new page. 


The request .sp N leaves N lines of blank space. N can be omitted (meaning skip 
a single line) or can be of the form Ni (for N inches) or Nc (for N centimeters). For 
example, the input: 
.sp 1.5i 
My thoughts on the subject 
.sp 
leaves one and a half inches of space, followed by the line “My thoughts on the 
subject”, followed by a single blank line. 


The .in +N request changes the amount of white space on the left of the page 
(the indent). The argument N can be of the form +N (meaning leave N spaces more 
than you are already leaving), -N (meaning leave less than you do now), or just N 
(meaning leave exactly N spaces). N can be of the form Ni or Nc also. For example, 
the input: 


initial text 
.in 5 
some text 
.in +1i 
more text 
.in -2c 
final text 
produces “some text” indented exactly five spaces from the left margin, “more text” 
indented five spaces plus one inch from the left margin (fifteen spaces on a pica 
typewriter), and “final text” indented five spaces plus one inch minus two centimeters from 
the margin. That is, the output is: 


initial text 
     some text 
               more text 
       final text 


The .ti +N (temporary indent) request is used like .in +N when the indent 
should apply to one line only, after which it should revert to the previous indent. For 
example, the input: 

.in 1i 
.ti 0 
Ware, James R. The Best of Confucius, 
Halcyon House, 1950. 
An excellent book containing translations of 
most of Confucius’ most delightful sayings. 
A definite must for anyone interested in the early foundations 
of Chinese philosophy. 


produces: 

Ware, James R. The Best of Confucius, Halcyon House, 1950. An excellent book con- 
          taining translations of most of Confucius’ most delightful sayings. A 
          definite must for anyone interested in the early foundations of Chinese 
          philosophy. 

Text lines can be centered by using the .ce request. The line after the .ce is cen- 
tered (horizontally) on the page. To center more than one line, use .ce N (where N is 
the number of lines to center), followed by the N lines. If you want to center many 
lines but don’t want to count them, type: 

.ce 1000 
lines to center 
.ce 0 


The .ce 0 request tells NROFF to center zero more lines, in other words, stop centering. 


All of these requests cause a break; that is, they always start a new line. If you 
want to start a new line without performing any other action, use .br. 
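
Several of these layout requests can be combined to make a simple title page. The following is only a sketch with invented text, using the .ce 1000 and .ce 0 idiom described above so that the lines need not be counted: 

```roff
.\" Hypothetical title page built from the layout requests above.
.sp 2i
.ce 1000
My Thoughts on the Subject
.sp
Jane Jones
.ce 0
.bp
.pp
The body of the paper begins on the next page.
```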
2.5. Underlining 


Text can be underlined using the .ul request. The .ul request causes the next 
input line to be underlined when output. You can underline multiple lines by stating a 
count of input lines to underline, followed by those lines (as with the .ce request). For 
example, the input: 


.ul 2 
Notice that these two input lines 
are underlined. 


will underline those eight words in NROFF. (In TROFF they will be set in italics.) 


3. Displays 
Displays are sections of text to be set off from the body of the paper. Major quotes, 
tables, and figures are types of displays, as are all the examples used in this document. All 
displays except centered blocks are output single spaced. 


3.1. Major Quotes 


Major quotes are quotes which are several lines long, and hence are set in from 
the rest of the text without quote marks around them. These can be generated using 
the commands .(q and .)q to surround the quote. For example, the input: 


As Weizenbaum points out: 
.(q 
It is said that to explain is to explain away. 
This maxim is nowhere so well fulfilled 
as in the areas of computer programming,... 
.)q 

generates as output: 


As Weizenbaum points out: 


     It is said that to explain is to explain away. This maxim is nowhere so well fulfilled as 
     in the areas of computer programming,... 


3.2. Lists 


A list is an indented, single spaced, unfilled display. Lists should be used when 
the material to be printed should not be filled and justified like normal text, such as 
columns of figures or the examples used in this paper. Lists are surrounded by the 
requests .(l and .)l. For example, type: 


Alternatives to avoid deadlock are: 
.(l 
Lock in a specified order 
Detect deadlock and back out one process 
Lock all resources needed before proceeding 
.)l 


will produce: 
Alternatives to avoid deadlock are: 


Lock in a specified order 
Detect deadlock and back out one process 
Lock all resources needed before proceeding 
3.3. Keeps 


A keep is a display of lines which are kept on a single page if possible. An exam- 
ple of where you would use a keep might be a diagram. Keeps differ from lists in that 
lists may be broken over a page boundary whereas keeps will not. 


Blocks are the basic kind of keep. They begin with the request .(b and end with 
the request .)b. If there is not room on the current page for everything in the block, a 
new page is begun. This has the unpleasant effect of leaving blank space at the bottom 
of the page. When this is not appropriate, you can use the alternative, called floating 
keeps. 


Floating keeps move relative to the text. Hence, they are good for things which 
will be referred to by name, such as “See figure 3”. A floating keep will appear at the 
bottom of the current page if it will fit; otherwise, it will appear at the top of the next 
page. Floating keeps begin with the line .(z and end with the line .)z. For an example 
of a floating keep, see figure 1. The .hl request is used to draw a horizontal line so that 
the figure stands out from the text. 
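
By way of comparison with the floating keep in figure 1, a block might be used as in this sketch to hold a small column of figures on one page. The rows are made-up placeholder data: 

```roff
.\" Hypothetical block keep; the data lines are invented.
.(b
Year    Papers formatted
1976    12
1977    31
1978    54
.)b
```

If the four lines do not fit on the current page, the whole block should move to the top of the next page, leaving blank space behind. 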


3.4. Fancier Displays 


Keeps and lists are normally collected in nofill mode, so that they are good for 
tables and such. If you want a display in fill mode (for text), type .(1 F (Throughout 
this section, comments applied to .(1 also apply to .(b and .(z). This kind of display 
will be indented from both margins. For example, the input: 


.(l F
And now boys and girls,
a newer, bigger, better toy than ever before!
Be the first on your block to have your own computer!
Yes kids, you too can have one of these modern
data processing devices.
You too can produce beautifully formatted papers
without even batting an eye!
.)l


will be output as:


     And now boys and girls, a newer, bigger, better toy than ever before! Be the
     first on your block to have your own computer! Yes kids, you too can have one
     of these modern data processing devices. You too can produce beautifully
     formatted papers without even batting an eye!


.(z
.hl
Text of keep to be floated.
.sp
.ce
Figure 1. Example of a Floating Keep.
.hl
.)z

               Figure 1. Example of a Floating Keep.


Lists and blocks are also normally indented (floating keeps are normally left 
justified). To get a left-justified list, type .(l L. To get a list centered line-for-line,
type .(l C. For example, to get a filled, left justified list, enter:


.(l L F
text of block
.)l


The input: 


.(l
first line of unfilled display
more lines
.)l


produces the indented text: 


     first line of unfilled display
     more lines


Typing the character L after the .(l request produces the left justified result:


first line of unfilled display 
more lines 


Using C instead of L produces the line-at-a-time centered output: 


                    first line of unfilled display
                              more lines


Sometimes you may want to center several lines as a group, rather than
centering them one line at a time. To do this use centered blocks, which are surrounded
by the requests .(c and .)c. All the lines are centered as a unit, such that the
longest line is centered and the rest are lined up around that line. Notice that lines do
not move relative to each other using centered blocks, whereas they do using the C
argument to keeps.


Centered blocks are not keeps, and may be used in conjunction with keeps. For 
example, to center a group of lines as a unit and keep them on one page, use: 


.(b L
.(c
first line of unfilled display
more lines
.)c
.)b


to produce: 


                    first line of unfilled display
                    more lines


If the block requests (.(b and .)b) had been omitted the result would have been the 
same, but with no guarantee that the lines of the centered block would have all been on 
one page. Note the use of the L argument to .(b; this causes the centered block to 
center within the entire line rather than within the line minus the indent. Also, the 
center requests must be nested inside the keep requests. 


4. Annotations


There are a number of requests to save text for later printing. Footnotes are printed 
at the bottom of the current page. Delayed text is intended to be a variant form of foot- 
note; the text is printed only when explicitly called for, such as at the end of each chapter. 
Indexes are a type of delayed text having a tag (usually the page number) attached to each 
entry after a row of dots. Indexes are also saved until called for explicitly. 


4.1. Footnotes 


Footnotes begin with the request .(f and end with the request .)f. The current
footnote number is maintained automatically, and can be used by typing \**, to produce
a footnote number[1]. The number is automatically incremented after every footnote.
For example, the input:


.(q
A man who is not upright
and at the same time is presumptuous;
one who is not diligent and at the same time is ignorant;
one who is untruthful and at the same time is incompetent;
such men I do not count among acquaintances.\**
.(f
\**James R. Ware,
.ul
The Best of Confucius,
Halcyon House, 1950.
Page 77.
.)f
.)q


generates the result: 


     A man who is not upright and at the same time is presumptuous; one who is not
     diligent and at the same time is ignorant; one who is untruthful and at the same
     time is incompetent; such men I do not count among acquaintances.[2]


It is important that the footnote appears inside the quote, so that you can be sure that 
the footnote will appear on the same page as the quote. 


4.2. Delayed Text 


Delayed text is very similar to a footnote except that it is printed when called for 
explicitly. This allows a list of references to appear (for example) at the end of each 
chapter, as is the convention in some disciplines. Use \*# on delayed text instead of \** 
as on footnotes. 
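

As a minimal sketch of how this might look in a paper: the requests .(d, .)d, and .pd used below are the delayed text requests listed in the -me reference manual, not something introduced in this section, and the reference itself is invented for illustration.

```roff
Deadlock can be avoided by locking in a fixed order.\*#
.(d
\*# A. N. Author, A Hypothetical Title, 1980.
.)d
.pp
More text of the chapter.
.\" at the end of the chapter, print all delayed text saved so far
.pd
```

Nothing is printed where the .(d and .)d pair occurs; the saved entries should appear only where .pd is given.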


If you are using delayed text as your standard reference mechanism, you can still 
use footnotes, except that you may want to reference them with special characters* 
rather than numbers. 


4.3. Indexes 


An “index” (actually more like a table of contents, since the entries are not sorted
alphabetically) resembles delayed text, in that it is saved until called for. However,
each entry has the page number (or some other tag) appended to the last line of the
index entry after a row of dots.


[1] Like this.
[2] James R. Ware, The Best of Confucius, Halcyon House, 1950. Page 77.
* Such as an asterisk.


Index entries begin with the request .(x and end with .)x. The .)x request may
have an argument, which is the value to print as the “page number”. It defaults to the
current page number. If the page number given is an underscore (“_”) no page number
or line of dots is printed at all. To get the line of dots without a page number, type .)x
"" which specifies an explicitly null page number.


The .xp request prints the index. 
For example, the input: 


.(x
Sealing wax
.)x
.(x
Cabbages and kings
.)x _
.(x
Why the sea is boiling hot
.)x 2.5a
.(x
Whether pigs have wings
.)x ""
.(x
This is a terribly long index entry, such as might be used
for a list of illustrations, tables, or figures; I expect it to
take at least two lines.
.)x
.xp


generates: 


Sealing wax ....................................................... 29
Cabbages and kings
Why the sea is boiling hot ...................................... 2.5a
Whether pigs have wings ..............................................
This is a terribly long index entry, such as might be used for a list
     of illustrations, tables, or figures; I expect it to take at
     least two lines. ............................................. 29


The .(x request may have a single character argument, specifying the “name” of
the index; the normal index is x. Thus, several “indices” may be maintained simultaneously
(such as a list of tables, table of contents, etc.).


Notice that the index must be printed at the end of the paper, rather than at the 
beginning where it will probably appear (as a table of contents); the pages may have to 
be physically rearranged after printing. 


5. Fancier Features 


A large number of fancier requests exist, notably requests to provide other sorts of 
paragraphs, numbered sections of the form 1.2.3 (such as used in this document), and 
multicolumn output. 


5.1. More Paragraphs 
Paragraphs generally start with a blank line and with the first line indented. It is 
possible to get left-justified block-style paragraphs by using .lp instead of .pp, as
demonstrated by the next paragraph.


Sometimes you want to use paragraphs that have the body indented, and the first line
exdented (opposite of indented) with a label. This can be done with the .ip request. A 
word specified on the same line as .ip is printed in the margin, and the body is lined up 
at a prespecified position (normally five spaces). For example, the input: 


.ip one
This is the first paragraph.
Notice how the first line
of the resulting paragraph lines up
with the other lines in the paragraph.
.ip two
And here we are at the second paragraph already.
You may notice that the argument to .ip
appears
in the margin.
.lp
We can continue text...


produces as output: 


one  This is the first paragraph. Notice how the first line of the resulting paragraph
     lines up with the other lines in the paragraph.

two  And here we are at the second paragraph already. You may notice that the
     argument to .ip appears in the margin.


We can continue text without starting a new indented paragraph by using the .lp 
request. 


If you have spaces in the label of a .ip request, you must use an “unpaddable
space” instead of a regular space. This is typed as a backslash character (“\”) followed
by a space. For example, to print the label “Part 1”, enter:


.ip "Part\ 1"


If a label of an indented paragraph (that is, the argument to .ip) is longer than 
the space allocated for the label, .ip will begin a new line after the label. For example, 
the input: 


.ip longlabel
This paragraph had a long label.
The first character of text on the first line
will not line up with the text on second and subsequent lines,
although they will line up with each other.


will produce: 


longlabel
     This paragraph had a long label. The first character of text on the first line will
     not line up with the text on second and subsequent lines, although they will line
     up with each other.


It is possible to change the size of the label by using a second argument which is 
the size of the label. For example, the above example could be done correctly by say- 
ing: 

.ip longlabel 10 


which will make the paragraph indent 10 spaces for this paragraph only. If you have 
many paragraphs to indent all the same amount, use the number register ii. For exam- 
ple, to leave one inch of space for the label, type: 

.nr ii 1i

somewhere before the first call to .ip. Refer to the reference manual for more informa- 
tion. 
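
A short sketch of the register in use, assuming nothing beyond the requests already described; the labels and paragraph text are made up for illustration:

```roff
.\" give every following .ip paragraph a one inch indent
.nr ii 1i
.ip longlabel
The body of this paragraph starts one inch in, so the label fits.
.ip short
The body of this paragraph lines up with the one above.
```

Setting the register once avoids repeating a second argument on each .ip line.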

If .ip is used with no argument at all no hanging tag will be printed. For example, 
the input: 

.ip [a]
This is the first paragraph of the example.
We have seen this sort of example before.
.ip
This paragraph is lined up with the previous paragraph,
but it has no tag in the margin.

produces as output: 

[a]  This is the first paragraph of the example. We have seen this sort of example
     before.

     This paragraph is lined up with the previous paragraph, but it has no tag in the
     margin.

A special case of .ip is .np, which automatically numbers paragraphs sequentially
from 1. The numbering is reset at the next .pp, .lp, or .sh (to be described in the next 
section) request. For example, the input: 

.np
This is the first point.
.np
This is the second point.
Points are just regular paragraphs
which are given sequence numbers automatically
by the .np request.
.pp
This paragraph will reset numbering by .np.
.np
For example,
we have reverted to numbering from one now.


generates: 

(1)  This is the first point.

(2)  This is the second point. Points are just regular paragraphs which are given
     sequence numbers automatically by the .np request.

     This paragraph will reset numbering by .np.

(1)  For example, we have reverted to numbering from one now.


5.2. Section Headings 


Section numbers (such as the ones used in this document) can be automatically 
generated using the .sh request. You must tell .sh the depth of the section number 
and a section title. The depth specifies how many numbers are to appear (separated by 
decimal points) in the section number. For example, the section number 4.2.5 has a 
depth of three. 

Section numbers are incremented in a fairly intuitive fashion. If you add a 
number (increase the depth), the new number starts out at one. If you subtract section 
numbers (or keep the same number) the final number is incremented. For example, the 
input: 


.sh 1 "The Preprocessor"
.sh 2 "Basic Concepts"
.sh 2 "Control Inputs"
.sh 3
.sh 3
.sh 1 "Code Generation"
.sh 3


produces as output the result: 


1. The Preprocessor
1.1. Basic Concepts
1.2. Control Inputs
1.2.1.
1.2.2.
2. Code Generation
2.1.1.


You can specify the section number to begin by placing the section number after 
the section title, using spaces instead of dots. For example, the request: 

.sh 3 "Another section" 7 3 4


will begin the section numbered 7.3.4; all subsequent .sh requests will number relative 
to this number. 


There are more complex features which will cause each section to be indented pro- 
portionally to the depth of the section. For example, if you enter: 


.nr si N


each section will be indented by an amount N. N must have a scaling factor attached, 
that is, it must be of the form Nx, where x is a character telling what units N is in. 
Common values for x are i for inches, c for centimeters, and n for ens (the width of a
single character). For example, to indent each section one-half inch, type: 


.nr si 0.5i


After this, sections will be indented by one-half inch per level of depth in the section 
number. For example, this document was produced using the request 


.nr si 3n 
at the beginning of the input file, giving three spaces of indent per section depth. 
Section headers without automatically generated numbers can be done using: 

.uh "Title"


which will do a section heading, but will put no number on the section. 
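
Putting these pieces together, the following is a minimal sketch that uses only the requests described in this section; the section titles and text are hypothetical:

```roff
.\" indent three ens per level of section depth
.nr si 3n
.sh 1 "Design"
.pp
Text of section 1.
.sh 2 "Goals"
.pp
Text of section 1.1, indented more than section 1.
.sh 2 "Constraints"
.pp
Text of section 1.2.
.uh "Acknowledgements"
.pp
Text under a heading that carries no number.
```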


5.3. Parts of the Basic Paper 


There are some requests which assist in setting up papers. The .tp request initial- 
izes for a title page. There are no headers or footers on a title page, and unlike other 
pages you can space down and leave blank space at the top. For example, a typical title 
page might appear as: 


.tp
.sp 2i
.(l C
THE GROWTH OF TOENAILS
IN UPPER PRIMATES
.sp
by
.sp
Frank N. Furter
.)l
.bp


The request .th sets up the environment of the NROFF processor to do a thesis, 
using the rules established at Berkeley. It defines the correct headers and footers (a 
page number in the upper right hand corner only), sets the margins correctly, and dou- 
ble spaces. 


The .+c T request can be used to start chapters. Each chapter is automatically 
numbered from one, and a heading is printed at the top of each chapter with the 
chapter number and the chapter name T. For example, to begin a chapter called
“Conclusions”, use the request:


.+c "CONCLUSIONS"


which will produce, on a new page, the lines 


                              CHAPTER 5
                             CONCLUSIONS


with appropriate spacing for a thesis. Also, the header is moved to the foot of the page 
on the first page of a chapter. Although the .+c request was not designed to work only
with the .th request, it is tuned for the format acceptable for a PhD thesis at Berkeley. 


If the title parameter T is omitted from the .+c request, the result is a chapter
with no heading. This can also be used at the beginning of a paper; for example, .+c
was used to generate page one of this document. 


Although papers traditionally have the abstract, table of contents, and so forth at 
the front of the paper, it is more convenient to format and print them last when using 
NROFF. This is so that index entries can be collected and then printed for the table of 
contents (or whatever). At the end of the paper, issue the .++ P request, which begins 
the preliminary part of the paper. After issuing this request, the .+c request will begin
a preliminary section of the paper. Most notably, this prints the page number restarted
from one in lower case Roman numbers. .+c may be used repeatedly to begin different
parts of the front material, for example, the abstract, the table of contents, acknowledgments,
list of illustrations, etc. The request .++ B may also be used to begin the
bibliographic section at the end of the paper. For example, the paper might appear as
outlined in figure 2. (In this figure, comments begin with the sequence \".)


5.4. Equations and Tables 


Two special UNIX programs exist to format special types of material. Eqn and 
neqn set equations for the phototypesetter and NROFF respectively. Tbl arranges to 
print extremely pretty tables in a variety of formats. This document will only describe 
the embellishments to the standard features; consult the reference manuals for those 
processors for a description of their use. 


The eqn and neqn programs are described fully in the document Typesetting 
Mathematics — Users’ Guide by Brian W. Kernighan and Lorinda L. Cherry. 


.th                         \" set for thesis mode
.fo ''DRAFT''               \" define footer for each page
.tp                         \" begin title page
.(l C                       \" center a large block
THE GROWTH OF TOENAILS
IN UPPER PRIMATES
.sp
by
.sp
Frank Furter
.)l                         \" end centered part
.+c INTRODUCTION            \" begin chapter named "INTRODUCTION"
.(x t                       \" make an entry into index `t'
Introduction
.)x                         \" end of index entry
text of chapter one
.+c "NEXT CHAPTER"          \" begin another chapter
.(x t                       \" enter into index `t' again
Next Chapter
.)x
text of chapter two
.+c CONCLUSIONS
.(x t
Conclusions
.)x
text of chapter three
.++ B                       \" begin bibliographic information
.+c BIBLIOGRAPHY            \" begin another `chapter'
.(x t
Bibliography
.)x
text of bibliography
.++ P                       \" begin preliminary material
.+c "TABLE OF CONTENTS"
.xp t                       \" print index `t' collected above
.+c PREFACE                 \" begin another preliminary section
text of preface


               Figure 2. Outline of a Sample Paper


Equations are centered, and are kept on one page. They are introduced by the .EQ 
request and terminated by the .EN request. 

The .EQ request may take an equation number as an optional argument, which is 
printed vertically centered on the right hand side of the equation. If the equation 
becomes too long it should be split between two lines. To do this, type: 


.EQ (eq 34)
text of equation 34
.EN C
.EQ
continuation of equation 34
.EN


The C on the .EN request specifies that the equation will be continued. 


The tbl program produces tables. It is fully described (including numerous examples)
in the document Tbl - A Program to Format Tables by M. E. Lesk. Tables begin
with the .TS request and end with the .TE request. Tables are normally kept on a sin- 
gle page. If you have a table which is too big to fit on a single page, so that you know it 
will extend to several pages, begin the table with the request .TS H and put the 
request .TH after the part of the table which you want duplicated at the top of every 
page that the table is printed on. For example, a table definition for a long table might 
look like: 


.TS H
c s s
n n n.
THE TABLE TITLE
.TH
text of the table
.TE


5.5. Two Column Output 


You can get two column output automatically by using the request .2c. This 
causes everything after it to be output in two-column form. The request .bc will start a
new column; it differs from .bp in that .bp may leave a totally blank column when it
starts a new page. To revert to single column output, use .1c.
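

The three requests might be combined as in the following sketch, which assumes only the behavior described above; the paragraph text is a placeholder:

```roff
.2c
.pp
Text that fills the first column.
.bc
.pp
Text that is forced to start at the top of the next column.
.1c
.pp
Text that again runs the full width of the page.
```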


5.6. Defining Macros 


A macro is a collection of requests and text which may be used by stating a simple 
request. Macros begin with the line .de xx (where xx is the name of the macro to be 
defined) and end with the line consisting of two dots. After defining the macro, stating 
the line .xx is the same as stating all the other lines. For example, to define a macro 
that spaces 3 lines and then centers the next input line, enter: 


.de SS
.sp 3
.ce
..


and use it by typing: 


.SS
Title Line 
(beginning of text) 


Macro names may be one or two characters. In order to avoid conflicts with 
names in -me, always use upper case letters as names. The only names to avoid are
TS, TH, TE, EQ, and EN. 
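

As a further sketch following the naming advice above, here is a macro with the hypothetical name FG (upper case, and not one of the reserved names) that leaves one blank line and centers the next input line, which might serve for figure captions:

```roff
.\" FG: space one line, then center the next input line
.de FG
.sp
.ce
..
.FG
Figure 3. A hypothetical caption.
```

The caption line after .FG is ordinary text; it is centered only because the macro ends with .ce.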


5.7. Annotations Inside Keeps 


Sometimes you may want to put a footnote or index entry inside a keep. For 
example, if you want to maintain a “list of figures” you will want to do something like: 


.(z
.(c
text of figure
.)c
.ce
Figure 5.
.(x f
Figure 5
.)x
.)z


which you may hope will give you a figure with a label and an entry in the index f 
(presumably a list of figures index). Unfortunately, the index entry is read and inter- 
preted when the keep is read, not when it is printed, so the page number in the index is 
likely to be wrong. The solution is to use the magic string \! at the beginning of all the 
lines dealing with the index. In other words, you should use: 


.(z
.(c
Text of figure
.)c
.ce
Figure 5.
\!.(x f
\!Figure 5
\!.)x
.)z


which will defer the processing of the index until the figure is output. This will guaran- 
tee that the page number in the index is correct. The same comments apply to blocks 
(with .(b and .)b) as well. 


6. TROFF and the Photosetter 


With a little care, you can prepare documents that will print nicely on either a regu- 
lar terminal or when phototypeset using the TROFF formatting program. 


6.1. Fonts 


A font is a style of type. There are three fonts that are available simultaneously, 
Times Roman, Times Italic, and Times Bold, plus the special math font. The normal 
font is Roman. Text which would be underlined in NROFF with the .ul request is set in 
italics in TROFF. 


There are ways of switching between fonts. The requests .r, .i, and .b switch to 
Roman, italic, and bold fonts respectively. You can set a single word in some font by 
typing (for example): 


.i word


which will set word in italics but does not affect the surrounding text. In NROFF, italic 
and bold text is underlined. 


Notice that if you are setting more than one word in whatever font, you must surround
those words with double quote marks (") so that they will appear to the NROFF processor
as a single word. The quote marks will not appear in the formatted text. If you
do want a quote mark to appear, you should quote the entire string (even if a single 
word), and use two quote marks where you want one to appear. For example, if you 
want to produce the text: 


"Master Control"


in italics, you must type:


.i """Master Control\|"""


The \| produces a very narrow space so that the “l” does not overlap the quote sign in
TROFF, like this:


"Master Control"


There are also several “pseudo-fonts” available. The input: 


.(b
.u underlined
.bi "bold italics"
.bx "words in a box"
.)b


generates


     underlined
     bold italics
     words in a box


In NROFF these all just underline the text. Notice that pseudo font requests set only 
the single parameter in the pseudo font; ordinary font requests will begin setting all 
text in the special font if you do not provide a parameter. No more than one word 
should appear with these three font requests in the middle of lines. This is because of 
the way TROFF justifies text. For example, if you were to issue the requests: 


.bi "some bold italics"
and
.bx "words in a box"


in the middle of a line, TROFF would produce visibly distorted renderings of “some bold
italics” and “words in a box”, which I think you will agree does not look good.

The second parameter of all font requests is set in the original font. For example, 
the font request: 
.b bold face 
generates “bold” in bold font, but sets “face” in the font of the surrounding text, 
resulting in: 
boldface. 
To set the two words bold and face both in bold face, type: 


.b "bold face"


You can mix fonts in a word by using the special sequence \c at the end of a line
to indicate “continue text processing”; this allows input lines to be joined together
without a space between them. For example, the input:


.u under \c 
.i italics 
generates underitalics, but if we had typed: 
.u under 
.i italics


the result would have been under italics as two words. 


6.2. Point Sizes


The phototypesetter supports different sizes of type, measured in points. The
default point size is 10 points for most text, 8 points for footnotes. To change the
pointsize, type:


.sz +N


where N is the size wanted in points. The vertical spacing (the distance between the
baselines, that is, the bottoms of most letters, of adjacent lines) is set to be
proportional to the type size.


Warning: changing point sizes on the phototypesetter is a slow mechanical opera- 
tion. Size changes should be considered carefully. 
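

A minimal sketch of a size change within running text follows. It assumes that a bare number such as 12 is accepted as the size. No paragraph request appears between the size changes here, because the reference manual says paragraph macros reset the type size to the paragraph default:

```roff
.pp
This sentence is set in the default size.
.sz 12
This sentence is set in 12 point type.
.sz 10
This sentence returns to 10 point type.
```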


6.3. Quotes 


It is conventional when using the typesetter to use pairs of grave and acute 
accents to generate double quotes, rather than the double quote character ("). This is
because it looks better to use grave and acute accents; for example, compare "quote" to
“quote”.


In order to make quotes compatible between the typesetter and terminals, you
may use the sequences \*(lq and \*(rq to stand for the left and right quote respectively.
These both appear as " on most terminals, but are typeset as “ and ” respectively.
For example, use:


\*(lqSome things aren’t true
even if they did happen.\*(rq 


to generate the result: 

“Some things aren’t true even if they did happen.” 
As a shorthand, the special font request: 

.q "quoted text"


will generate “quoted text”. Notice that you must surround the material to be quoted 
with double quote marks if it is more than one word. 


This document describes in extremely terse form the features of the -me macro package
for version seven NROFF/TROFF. Some familiarity is assumed with those programs;
specifically, the reader should understand breaks, fonts, pointsizes, the use and definition of
number registers and strings, how to define macros, and scaling factors for ens, points, v’s 
(vertical line spaces), etc. 


For a more casual introduction to text processing using NROFF, refer to the document 
Writing Papers with NROFF using -me.


There are a number of macro parameters that may be adjusted. Fonts may be set to a 
font number only. In NROFF font 8 is underlined, and is set in bold font in TROFF (although 
font 3, bold in TROFF, is not underlined in NROFF). Font 0 is no font change; the font of the 
surrounding text is used instead. Notice that fonts 0 and 8 are “pseudo-fonts”; that is, they 
are simulated by the macros. This means that although it is legal to set a font register to zero 
or eight, it is not legal to use the escape character form, such as: 


\f8 


All distances are in basic units, so it is nearly always necessary to use a scaling factor. 
For example, the request to set the paragraph indent to eight one-en spaces is: 


.nr pi 8n


and not


.nr pi 8


which would set the paragraph indent to eight basic units, or about 0.02 inch. Default param- 
eter values are given in brackets in the remainder of this document. 


Registers and strings of the form $x may be used in expressions but should not be 
changed. Macros of the form $x perform some function (as described) and may be redefined 
to change this function. This may be a sensitive operation; look at the body of the original 
macro before changing it. 


All names in -me follow a rigid naming convention. The user may define number registers,
strings, and macros, provided that s/he uses single character upper case names or double
character names consisting of letters and digits, with at least one upper case letter. In no case 
should special characters be used in user-defined names. 


On daisy wheel type printers in twelve pitch, the -rx1 flag can be stated to make lines
default to one eighth inch (the normal spacing for a newline in twelve-pitch). This is normally 
too small for easy readability, so the default is to space one sixth inch. 


+NROFF and TROFF are Trademarks of Bell Laboratories. 


1. Paragraphing


These macros are used to begin paragraphs. The standard paragraph macro is .pp; the 
others are all variants to be used for special purposes. 


The first call to one of the paragraphing macros defined in this section or the .sh macro 
(defined in the next section) initializes the macro processor. After initialization it is not possible
to use any of the following requests: .sc, .lo, .th, or .ac. Also, the effects of changing
parameters which will have a global effect on the format of the page (notably page length and 
header and footer margins) are not well defined and should be avoided. 


.lp         Begin left-justified paragraph. Centering and underlining are turned
            off if they were on, the font is set to \n(pf [1], the type size is set to
            \n(pp [10p], and a \n(ps space is inserted before the paragraph [0.35v
            in TROFF, 1v or 0.5v in NROFF depending on device resolution]. The
            indent is reset to \n($i [0] plus \n(po [0] unless the paragraph is
            inside a display (see .ba). At least the first two lines of the
            paragraph are kept together on a page.


.pp         Like .lp, except that it puts \n(pi [5n] units of indent. This is the
            standard paragraph macro.


.ip T I     Indented paragraph with hanging tag. The body of the following
            paragraph is indented I spaces (or \n(ii [5n] spaces if I is not specified)
            more than a non-indented paragraph (such as with .pp) is. The title
            T is exdented (opposite of indented). The result is a paragraph with
            an even left edge and T printed in the margin. Any spaces in T must
            be unpaddable. If T will not fit in the space provided, .ip will start a
            new line.


.np         A variant of .ip which numbers paragraphs. Numbering is reset after
            a .lp, .pp, or .sh. The current paragraph number is in \n($p.


2. Section Headings 


Numbered sections are similar to paragraphs except that a section number is automatically
generated for each one. The section numbers are of the form 1.2.3. The depth of the
section is the count of numbers (separated by decimal points) in the section number. 


Unnumbered section headings are similar, except that no number is attached to the 
heading. 


.sh +N T a b c d e f
            Begin numbered section of depth N. If N is missing the current depth
            (maintained in the number register \n($0) is used. The values of the
            individual parts of the section number are maintained in \n($1
            through \n($6. There is a \n(ss [1v] space before the section. T is
            printed as a section title in font \n(sf [8] and size \n(sp [10p]. The
            “name” of the section may be accessed via \*($n. If \n(si is non-zero,
            the base indent is set to \n(si times the section depth, and the section
            title is exdented. (See .ba.) Also, an additional indent of \n(so [0] is
            added to the section title (but not to the body of the section). The
            font is then set to the paragraph font, so that more information may
            occur on the line with the section number and title. .sh ensures that
            there is enough room to print the section head plus the beginning of a
            paragraph (about 3 lines total). If a through f are specified, the
            section number is set to that number rather than incremented
            automatically. If any of a through f are a hyphen that number is not
            reset. If T is a single underscore (“_”) then the section depth and
            numbering is reset, but the base indent is not reset and nothing is
            printed out. This is useful to automatically coordinate section
            numbers with chapter numbers.


.sx +N  Go to section depth N [-1], but do not print the number and title,
and do not increment the section number at level N. This has the
effect of starting a new paragraph at level N.


.uh T  Unnumbered section heading. The title T is printed with the same
rules for spacing, font, etc., as for .sh.


.$p T B N  Print section heading. May be redefined to get fancier headings. T is
the title passed on the .sh or .uh line; B is the section number for this
section, and N is the depth of this section. These parameters are not
always present; in particular, .sh passes all three, .uh passes only the
first, and .sx passes three, but the first two are null strings. Care
should be taken if this macro is redefined; it is quite complex and
subtle.


.$0 T B N  This macro is called automatically after every call to .$p. It is
normally undefined, but may be used to automatically put every section
title into the table of contents or for some similar function. T is the
title of the section that was just printed, B is the section
number, and N is the section depth.


.$1 - .$6  Traps called just before printing that depth section. May be defined
to (for example) give variable spacing before sections. These macros
are called from .$p, so if you redefine that macro you may lose this
feature.
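
The section macros above might be used as in the following sketch. The titles and body text are invented for illustration, and the .$0 definition is only one possible use of that hook (here it files each numbered heading in the default index x, which .xp then prints); .pp is the standard -me paragraph macro.

```roff
.\" Hook called after each section heading is printed:
.\" $1 is the title, $2 the section number, $3 the depth.
.de $0
.(x
\\$2 \\$1
.)x
..
.sh 1 "Introduction"
.pp
Body of section 1.
.sh 2 "Scope"
.pp
Body of section 1.1.
.sh 1 "Method"
.pp
Body of section 2.
.\" Print the collected entries as a table of contents.
.bp
.xp
```

Because .$0 is called from the heading machinery, redefining .$p without calling it would silently disable this kind of automatic table of contents.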


3. Headers and Footers 


Headers and footers are put at the top and bottom of every page automatically. They
are set in font \n(tf [3] and size \n(tp [10p]. Each of the definitions applies as of the next
page. Three-part titles must be quoted if there are two blanks adjacent anywhere in the title
or more than eight blanks total.


The spacing of headers and footers is controlled by four number registers. \n(hm [4v]
is the distance from the top of the page to the top of the header, \n(fm [3v] is the distance
from the bottom of the page to the bottom of the footer, \n(tm [7v] is the distance from the
top of the page to the top of the text, and \n(bm [6v] is the distance from the bottom of the
page to the bottom of the text (nominal). The macros .m1, .m2, .m3, and .m4 are also
supplied for compatibility with ROFF documents.


.he 'l'm'r'  Define three-part header, to be printed on the top of every page.

.fo 'l'm'r'  Define footer, to be printed at the bottom of every page.

.eh 'l'm'r'  Define header, to be printed at the top of every even-numbered page.

.oh 'l'm'r'  Define header, to be printed at the top of every odd-numbered page.

.ef 'l'm'r'  Define footer, to be printed at the bottom of every even-numbered
page.

.of 'l'm'r'  Define footer, to be printed at the bottom of every odd-numbered
page.

.hx  Suppress headers and footers on the next page.

.m1 +N  Set the space between the top of the page and the header [4v].

.m2 +N  Set the space between the header and the first line of text [2v].

.m3 +N  Set the space between the bottom of the text and the footer [2v].

.m4 +N  Set the space between the footer and the bottom of the page [4v].

.ep  End this page, but do not begin the next page. Useful for forcing out
footnotes, but other than that hardly ever used. Must be followed by
a .bp or the end of input.


.$h  Called at every page to print the header. May be redefined to provide
fancy (e.g., multi-line) headers, but doing so loses the function of the
.he, .fo, .eh, .oh, .ef, and .of requests, as well as the chapter-style
title feature of .+c.


.$f  Print footer; same comments apply as in .$h.


.$H  A normally undefined macro which is called at the top of each page
(after outputting the header, initial saved floating keeps, etc.); in other
words, this macro is called immediately before printing text on a page.
It can be used for column headings and the like.
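
As a rough illustration of the header and footer requests, the fragment below sets mirrored running heads and a dated footer. The title text is made up; % in a three-part title is replaced by the page number, and \*(td is the -me date string. The definitions take effect as of the next page.

```roff
.\" Even pages: page number on the left, title on the right.
.eh '%''Project Notes'
.\" Odd pages: the mirror image.
.oh 'Project Notes''%'
.\" Every page: today's date centered in the footer.
.fo ''\*(td''
.\" Leave 6v instead of the default 4v above the header.
.m1 6v
```

Using .eh and .oh rather than redefining .$h keeps the chapter-style title feature of .+c available.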


4. Displays 


All displays except centered blocks and block quotes are preceded and followed by an
extra \n(bs [same as \n(ps] space. Quote spacing is stored in a separate register; centered
blocks have no default initial or trailing space. The vertical spacing of all displays except
quotes and centered blocks is stored in register \n($R instead of \n($r.


.(l m f  Begin list. Lists are single spaced, unfilled text. If f is F, the list will
be filled. If m [I] is I the list is indented by \n(bi [4n]; if M the list is
indented to the left margin; if L the list is left justified with respect to
the text (different from M only if the base indent (stored in \n($i and
set with .ba) is not zero); and if C the list is centered on a line-by-line
basis. The list is set in font \n(df [0]. Must be matched by a .)l.
This macro is almost like .(b except that no attempt is made to keep
the display on one page.


.)l  End list.


.(q  Begin major quote. These are single spaced, filled, moved in from the
text on both sides by \n(qi [4n], preceded and followed by \n(qs
[same as \n(bs] space, and are set in point size \n(qp [one point
smaller than surrounding text].


.)q  End major quote.


.(b m f  Begin block. Blocks are a form of keep, where the text of a keep is
kept together on one page if possible (keeps are useful for tables and
figures which should not be broken over a page). If the block will not
fit on the current page a new page is begun, unless that would leave
more than \n(bt [0] white space at the bottom of the text. If \n(bt is
zero, the threshold feature is turned off. Blocks are not filled unless f
is F, when they are filled. The block will be left-justified if m is L,
indented by \n(bi [4n] if m is I or absent, centered (line-for-line) if m
is C, and left justified to the margin (not to the base indent) if m is
M. The block is set in font \n(df [0].


.)b  End block.
.(z m f  Begin floating keep. Like .(b except that the keep is floated to the
bottom of the page or the top of the next page. Therefore, its position
relative to the text changes. The floating keep is preceded and
followed by \n(zs [1v] space. Also, it defaults to mode M.


.)z  End floating keep.


.(c  Begin centered block. The next keep is centered as a block, rather
than on a line-by-line basis as with .(b C. This call may be nested
inside keeps.


.)c  End centered block.


5. Annotations


.(d  Begin delayed text. Everything in the next keep is saved for output
later with .pd, in a manner similar to footnotes.


.)d n  End delayed text. The delayed text number register \n($d and the
associated string \*# are incremented if \*# has been referenced.


.pd  Print delayed text. Everything diverted via .(d is printed and
truncated. This might be used at the end of each chapter.


.(f  Begin footnote. The text of the footnote is floated to the bottom of
the page and set in font \n(ff [1] and size \n(fp [8p]. Each entry is
preceded by \n(fs [0.2v] space, is indented \n(fi [3n] on the first line,
and is indented \n(fu [0] from the right margin. Footnotes line up
underneath two columned output. If the text of the footnote will not
all fit on one page it will be carried over to the next page.


.)f n  End footnote. The number register \n($f and the associated string \**
are incremented if they have been referenced.


.$s  The macro to output the footnote separator. This macro may be
redefined to give other size lines or other types of separators.
Currently it draws a 1.5i line.


.(x x  Begin index entry. Index entries are saved in the index x [x] until
called up with .xp. Each entry is preceded by a \n(xs [0.2v] space.
Each entry is “undented” by \n(xu [0.5i]; this register tells how far
the page number extends into the right margin.


.)x P A  End index entry. The index entry is finished with a row of dots with
A [null] right justified on the last line (such as for an author’s name),
followed by P [\n%]. If A is specified, P must be specified; \n% can
be used to print the current page number. If P is an underscore, no
page number and no row of dots are printed.


.xp x  Print index x [x]. The index is formatted in the font, size, and so forth
in effect at the time it is printed, rather than at the time it is
collected.


6. Columned Output


.2c +S N  Enter two-column mode. The column separation is set to +S [4n, 0.5i
in ACM mode] (saved in \n($s). The column width, calculated to fill
the single column line length with both columns, is stored in \n($l.
The current column is in \n($c. You can test register \n($m [1] to
see if you are in single column or double column mode. Actually, the
request enters N [2] columned output.


.1c  Revert to single-column mode.


.bc  Begin column. This is like .bp except that it begins a new column on
a new page only if necessary, rather than forcing a whole new page if
there is another column left on the current page.
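
The display, annotation, and column macros of sections 4 through 6 could be combined roughly as follows. This is only a sketch: the text is invented, and the display modes chosen (a left-justified block, a major quote) are arbitrary.

```roff
.pp
The result is shown below.\**
.(f
\** Footnote text, floated to the bottom of the page.
.)f
.\" A left-justified, unfilled block kept on one page if possible.
.(b L
first row      1
second row     2
.)b
.\" A filled quotation, moved in from both margins.
.(q
A short quoted passage goes here.
.)q
.\" Two-column output, with a forced column break.
.2c
.pp
This text is set in two columns.
.bc
.pp
This paragraph starts the next column.
.1c
```

The footnote number string \** is referenced once in the text and once inside the footnote, so .)f increments it for the next footnote.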
7. Fonts and Sizes


.sz +P  The pointsize is set to P [10p], and the line spacing is set
proportionally. The ratio of line spacing to pointsize is stored in \n($r. The
ratio used internally by displays and annotations is stored in \n($R
(although this is not used by .sz).


.r W X  Set W in roman font, appending X in the previous font. To append
different font requests, use X = \c. If no parameters, change to roman
font.


.i W X  Set W in italics, appending X in the previous font. If no parameters,
change to italic font. Underlines in NROFF.


.b W X  Set W in bold font and append X in the previous font. If no
parameters, switch to bold font. In NROFF, underlines.


.rb W X  Set W in bold font and append X in the previous font. If no
parameters, switch to bold font. .rb differs from .b in that .rb does not
underline in NROFF.


.u W X  Underline W and append X. This is a true underlining, as opposed to
the .ul request, which changes to “underline font” (usually italics in
TROFF). It won’t work right if W is spread or broken (including
hyphenated). In other words, it is safe in nofill mode only.


.q W X  Quote W and append X. In NROFF this just surrounds W with double
quote marks (‘"’), but in TROFF uses directed quotes.


.bi W X  Set W in bold italics and append X. Actually, sets W in italic and
overstrikes once. Underlines in NROFF. It won’t work right if W is
spread or broken (including hyphenated). In other words, it is safe in
nofill mode only.


.bx W X  Sets W in a box, with X appended. Underlines in NROFF. It won’t
work right if W is spread or broken (including hyphenated). In other
words, it is safe in nofill mode only.


8. Roff Support


.ix +N  Indent, no break. Equivalent to 'in N.


.bl N  Leave N contiguous white space, on the next page if not enough room
on this page. Equivalent to a .sp N inside a block.


.pa +N  Equivalent to .bp.


.ro  Set page number in roman numerals. Equivalent to .af % i.


.ar  Set page number in arabic. Equivalent to .af % 1.


.n1  Number lines in margin from one on each page.


.n2 N  Number lines from N, stop if N = 0.


.sk  Leave the next output page blank, except for headers and footers.
This is used to leave space for a full-page diagram which is produced
externally and pasted in later. To get a partial-page paste-in display,
say .sv N, where N is the amount of space to leave; this space will be
output immediately if there is room, and will otherwise be output at
the top of the next page. However, be warned: if N is greater than the
amount of available space on an empty page, no space will ever be
output.
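
A short sketch of the font and size macros in running text follows; the wording is invented. Each macro sets its first argument in the named font and appends the second argument (here trailing punctuation) in the previous font.

```roff
.pp
This is
.i emphasized
text, this is
.b "bold text" ,
and this word is quoted:
.q provisional .
.\" Raise the point size by two for one paragraph.
.sz +2
.pp
This paragraph is two points larger.
.sz -2
.pp
Back to the original size.
```

The example avoids .u, .bi, and .bx in filled text, since the descriptions above say those are safe in nofill mode only.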
9. Preprocessor Support


.EQ m T  Begin equation. The equation is centered if m is C or omitted,
indented \n(bi [4n] if m is I, and left justified if m is L. T is a title
printed on the right margin next to the equation. See Typesetting
Mathematics - User’s Guide by Brian W. Kernighan and Lorinda L.
Cherry.


.EN c  End equation. If c is C the equation must be continued by
immediately following with another .EQ, the text of which can be centered
along with this one. Otherwise, the equation is printed, always on one
page, with \n(es [0.5v in TROFF, 1v in NROFF] space above and below
it.


.TS h  Table start. Tables are single spaced and kept on one page if possible.
If you have a large table which will not fit on one page, use h = H and
follow the header part (to be printed on every page of the table) with
a .TH. See Tbl - A Program to Format Tables by M. E. Lesk.


.TH  With .TS H, ends the header portion of the table.


.TE  Table end. Note that this table does not float; in fact, it is not even
guaranteed to stay on one page if you use requests such as .sp
intermixed with the text of the table. If you want it to float (or if you use
requests inside the table), surround the entire table (including the .TS
and .TE requests) with the requests .(z and .)z.


10. Miscellaneous


.re  Reset tabs. Set to every 0.5i in TROFF and every 0.8i in NROFF.


.ba +N  Set the base indent to +N [0] (saved in \n($i). All paragraphs,
sections, and displays come out indented by this amount. Titles and
footnotes are unaffected. The .sh request performs a .ba request if
\n(si [0] is not zero, and sets the base indent to \n(si*\n($0.


.xl +N  Set the line length to N [6.0i]. This differs from .ll because it only
affects the current environment.


.ll +N  Set line length in all environments to N [6.0i]. This should not be
used after output has begun, and particularly not in two-columned
output. The current line length is stored in \n($l.


.hl  Draws a horizontal line the length of the page. This is useful inside
floating keeps to differentiate between the text and the figure.


.lo  This macro loads another set of macros (in /usr/lib/me/local.me)
which is intended to be a set of locally defined macros. These macros
should all be of the form .*X, where X is any letter (upper or lower
case) or digit.
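
As an illustration of the preprocessor support, the fragment below wraps a small table in a floating keep, as the .TE entry recommends, and sets one indented, labeled equation. The table data and the equation are invented, and the input is assumed to pass through tbl and eqn before nroff or troff; the tab(:) option is used only so that the column separator is visible here.

```roff
.(z
.TS
center box tab(:);
l n.
Item:Count
apples:12
pears:7
.TE
.\" Rule to separate the floated table from the text.
.hl
.)z
.\" Indented equation with the label (1) at the right margin.
.EQ I (1)
x = a sup 2 + b sup 2
.EN
```

Without the .(z and .)z pair the table would be set where it falls and might be split across pages.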


11. Standard Papers 


.tp  Begin title page. Spacing at the top of the page can occur, and
headers and footers are suppressed. Also, the page number is not
incremented for this page.


.th  Set thesis mode. This defines the modes acceptable for a doctoral
dissertation at Berkeley. It double spaces, defines the header to be a
single page number, and changes the margins to be 1.5 inch on the left
and one inch on the top. .++ and .+c should be used with it. This
macro must be stated before initialization, that is, before the first call
of a paragraphing macro or .sh.


.++ m H  This request defines the section of the paper which we are entering.
The section type is defined by m. C means that we are entering the
chapter portion of the paper, A means that we are entering the
appendix portion of the paper, P means that the material following should
be the preliminary portion (abstract, table of contents, etc.) of
the paper, AB means that we are entering the abstract (numbered
independently from 1 in Arabic numerals), and B means that we are
entering the bibliographic portion at the end of the paper. Also, the
variants RC and RA are allowed, which specify renumbering of pages
from one at the beginning of each chapter or appendix, respectively.
The H parameter defines the new header. If there are any spaces in it,
the entire header must be quoted. If you want the header to have the
chapter number in it, use the string \\\\n(ch. For example, to number
appendixes A.1 etc., type .++ RA '''\\\\n(ch.%'. Each section
(chapter, appendix, etc.) should be preceded by the .+c request. It
should be mentioned that it is easier when using TROFF to put the
front material at the end of the paper, so that the table of contents
can be collected and output; this material can then be physically
moved to the beginning of the paper.


.+c T  Begin chapter with title T. The chapter number is maintained in
\n(ch. This register is incremented every time .+c is called with a
parameter. The title and chapter number are printed by .$c. The
header is moved to the footer on the first page of each chapter. If T is
omitted, .$c is not called; this is useful for doing your own “title page”
at the beginning of papers without a title page proper. .$c calls .$C
as a hook so that chapter titles can be inserted into a table of contents
automatically. The footnote numbering is reset to one.


.$c T  Print chapter number (from \n(ch) and T. This macro can be
redefined to your liking. It is defined by default to be acceptable for a
PhD thesis at Berkeley. This macro calls .$C, which can be defined to
make index entries, or whatever.


.$C K N T  This macro is called by .$c. It is normally undefined, but can be used
to automatically insert index entries, or whatever. K is a keyword,
either “Chapter” or “Appendix” (depending on the .++ mode); N is
the chapter or appendix number, and T is the chapter or appendix
title.


.ac A N  This macro (short for .acm) sets up the NROFF environment for
photo-ready papers as used by the ACM. This format is 25% larger,
and has no headers or footers. The author’s name A is printed at the
bottom of the page (but off the part which will be printed in the
conference proceedings), together with the current page number and
the total number of pages N. Additionally, this macro loads the file
/usr/lib/me/acm.me, which may later be augmented with other
macros useful for printing papers for ACM conferences. It should be
noted that this macro will not work correctly in TROFF, since it sets
the page length wider than the physical width of the phototypesetter
roll.
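
The standard-paper macros might be put together as in this sketch of a thesis skeleton. The chapter titles are placeholders, and the .$C definition shown is just one way to use the hook: it files each chapter or appendix title in an index named t (an arbitrary name), which is printed at the end so that it can be moved to the front of the paper.

```roff
.\" Thesis mode must come before the first paragraph or .sh.
.th
.\" Hook called by .$c: $1 keyword, $2 number, $3 title.
.de $C
.(x t
\\$1 \\$2: \\$3
.)x
..
.++ C
.+c "Introduction"
.pp
Text of the first chapter.
.+c "Related Work"
.pp
Text of the second chapter.
.++ A
.+c "Raw Data"
.pp
Text of Appendix A.
.\" Print the collected chapter list.
.bp
.xp t
```

Defining .$C rather than redefining .$c leaves the default Berkeley chapter heading format untouched.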
12. Predefined Strings


\**  Footnote number, actually \*[\n($f\*]. This string is incremented
after each call to .)f.


\*#  Delayed text number. Actually [\n($d].


\*[  Superscript. This string gives upward movement and a change to a
smaller point size if possible, otherwise it gives the left bracket
character (‘[’). Extra space is left above the line to allow room for the
superscript.


\*]  Unsuperscript. Inverse to \*[. For example, to produce a superscript
you might type x\*[2\*], which will produce x².


\*<  Subscript. Defaults to ‘<’ if half-carriage motion not possible. Extra
space is left below the line to allow for the subscript.


\*>  Inverse to \*<.


\*(dw  The day of the week, as a word.


\*(mo  The month, as a word.


\*(td  Today’s date, directly printable. The date is of the form April 8, 1984.
Other forms of the date can be used by using \n(dy (the day of the
month; for example, 8), \*(mo (as noted above) or \n(mo (the same,
but as an ordinal number; for example, April is 4), and \n(yr (the last
two digits of the current year).


\*(lq  Left quote marks. Double quote in NROFF.


\*(rq  Right quote.


\*-  3/4 em dash in TROFF; two hyphens in NROFF.
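
The predefined strings can be interpolated anywhere in text. The lines below are a small sketch with invented wording, showing the date strings, superscripts and subscripts, directed quotes, and the dash.

```roff
.pp
Printed on \*(dw, \*(td.
.pp
The area is r\*[2\*] and the first element is a\*<1\*>.
.pp
He called it \*(lqprovisional\*(rq \*- a fair description.
```

On a terminal without half-line motion the superscript and subscript fall back to the bracket and angle characters described above.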


13. Special Characters and Marks 


There are a number of special characters and diacritical marks (such as accents)
available through -me. To reference these characters, you must call the macro .sc to define the
characters before using them.


.sc  Define special characters and diacritical marks, as described in the
remainder of this section. This macro must be stated before
initialization.


The special characters available are listed below.


Name            Usage     Example

Acute accent    \*'       a\*'    á
Grave accent    \*`       e\*`    è
Umlaut          \*:       u\*:    ü
Tilde           \*~       n\*~    ñ
Caret           \*^       e\*^    ê
Cedilla         \*,       c\*,    ç
Czech           \*v       e\*v    ě
Circle          \*o       A\*o    Å
There exists    \*(qe             ∃
For all         \*(qa             ∀
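
A brief sketch of the special characters in use follows; the sample words are chosen only for illustration. Each accent string follows the letter it modifies, and .sc is placed before the first paragraph macro because it must be stated before initialization.

```roff
.sc
.pp
Accents follow the letter they modify:
cafe\*', garc\*,on, and nai\*:ve.
```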
NROFF/TROFF User’s Manual


Joseph F. Ossanna 


Bell Laboratories 
Murray Hill, New Jersey 07974 


Introduction 


NROFF and TROFF are text processors under the PDP-11 UNIX Time-Sharing System [1] that format text
for typewriter-like terminals and for a Graphic Systems phototypesetter, respectively. They accept lines
of text interspersed with lines of format control information and format the text into a printable,
paginated document having a user-designed style. NROFF and TROFF offer unusual freedom in
document styling, including: arbitrary style headers and footers; arbitrary style footnotes; multiple automatic
sequence numbering for paragraphs, sections, etc.; multiple column output; dynamic font and point-size
control; arbitrary horizontal and vertical local motions at any point; and a family of automatic
overstriking, bracket construction, and line drawing functions.


NROFF and TROFF are highly compatible with each other and it is almost always possible to prepare 
input acceptable to both. Conditional input is provided that enables the user to embed input expressly 
destined for either program. NROFF can prepare output directly for a variety of terminal types and is 
capable of utilizing the full resolution of each terminal. 
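
Conditional input of the kind mentioned above can be sketched with the built-in n and t conditions, which are true in NROFF and TROFF respectively. The line lengths and the string name Pr are arbitrary choices for this illustration.

```roff
.\" Taken only by NROFF: 65 character positions.
.if n .ll 65
.\" Taken only by TROFF: 6.5 inches.
.if t .ll 6.5i
.\" Define a string differently for each program.
.ie n .ds Pr terminal
.el .ds Pr phototypesetter
This copy was formatted for a \*(Pr.
```

The same input file can then be given to either program without change.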


Usage

The general form of invoking NROFF (or TROFF) at UNIX command level is

nroff options files (or troff options files)


where options represents any of a number of option arguments and files represents the list of files
containing the document to be formatted. An argument consisting of a single minus (-) is taken to be a
file name corresponding to the standard input. If no file names are given, input is taken from the
standard input. The options, which may appear in any order so long as they appear before the files, are:


Option     Effect


-olist     Print only pages whose page numbers appear in list, which consists of
           comma-separated numbers and number ranges. A number range has the form N-M and
           means pages N through M; an initial -N means from the beginning to page N; and
           a final N- means from N to the end.

-nN        Number first generated page N.

-sN        Stop every N pages. NROFF will halt prior to every N pages (default N=1) to
           allow paper loading or changing, and will resume upon receipt of a newline.
           TROFF will stop the phototypesetter every N pages, produce a trailer to allow
           changing cassettes, and will resume after the phototypesetter START button is
           pressed.

-mname     Prepends the macro file /usr/lib/tmac.name to the input files.

-raN       Register a (one-character) is set to N.

-i         Read standard input after the input files are exhausted.

-q         Invoke the simultaneous input-output mode of the rd request.


NROFF Only

-Tname     Specifies the name of the output terminal type. Currently defined names are 37
           for the (default) Model 37 Teletype®, tn300 for the GE TermiNet 300 (or any
           terminal without half-line capabilities), 300S for the DASI-300S, 300 for the
           DASI-300, and 450 for the DASI-450 (Diablo Hyterm).

-e         Produce equally-spaced words in adjusted lines, using full terminal resolution.


TROFF Only

-t         Direct output to the standard output instead of the phototypesetter.

-f         Refrain from feeding out paper and stopping phototypesetter at the end of the run.

-w         Wait until phototypesetter is available, if currently busy.

-b         TROFF will report whether the phototypesetter is busy or available. No text
           processing is done.

-a         Send a printable (ASCII) approximation of the results to the standard output.

-pN        Print all characters in point size N while retaining all prescribed spacings and
           motions, to reduce phototypesetter elapsed time.

-g         Prepare output for the Murray Hill Computation Center phototypesetter and direct
           it to the standard output.
Each option is invoked as a separate argument; for example,

nroff -o4,8-10 -T300S -mabc file1 file2


requests formatting of pages 4, 8, 9, and 10 of a document contained in the files named file1 and file2,
specifies the output terminal as a DASI-300S, and invokes the macro package abc.


Various pre- and post-processors are available for use with NROFF and TROFF. These include the
equation preprocessors NEQN and EQN [2] (for NROFF and TROFF respectively), and the
table-construction preprocessor TBL [3]. A reverse-line postprocessor COL [4] is available for multiple-column
NROFF output on terminals without reverse-line ability; COL expects the Model 37 Teletype escape
sequences that NROFF produces by default. TK [4] is a 37 Teletype simulator postprocessor for printing
NROFF output on a Tektronix 4014. TCAT [4] is a phototypesetter-simulator postprocessor for TROFF that
produces an approximation of phototypesetter output on a Tektronix 4014. For example, in


tbl files | eqn | troff -t options | tcat


the first | indicates the piping of TBL’s output to EQN’s input; the second the piping of EQN’s output to
TROFF’s input; and the third indicates the piping of TROFF’s output to TCAT. GCAT [4] can be used to
send TROFF (-g) output to the Murray Hill Computation Center.


The remainder of this manual consists of: a Summary and Index; a Reference Manual keyed to the 
index; and a set of Tutorial Examples. Another tutorial is [5]. 


Joseph F. Ossanna 
References 
[1] K. Thompson, D. M. Ritchie, UNIX Programmer’s Manual, Sixth Edition (May 1975).

[2] B. W. Kernighan, L. L. Cherry, Typesetting Mathematics - User’s Guide (Second Edition), Bell Laboratories
internal memorandum.

[3] M. E. Lesk, Tbl - A Program to Format Tables, Bell Laboratories internal memorandum.

[4] Internal on-line documentation, on UNIX.

[5] B. W. Kernighan, A TROFF Tutorial, Bell Laboratories internal memorandum.


SUMMARY AND INDEX


Request         Initial        If No
Form            Value*         Argument     Notes#   Explanation


1. General Explanation

2. Font and Character Size Control

.ps ±N          10 point       previous     E        Point size; also \s±N.†
.ss N           12/36 em       ignored      E        Space-character size set to N/36 em.†
.cs F N M       off            -            P        Constant character space (width) mode (font F).†
.bd F N         off            -            P        Embolden font F by N-1 units.†
.bd S F N       off            -            P        Embolden Special Font when current font is F.†
.ft F           Roman          previous     E        Change to font F = x, xx, or 1-4. Also \fx, \f(xx, \fN.
.fp N F         R,I,B,S        ignored      -        Font named F mounted on physical position 1≤N≤4.

3. Page Control

.pl ±N          11 in          11 in        v        Page length.
.bp ±N          N=1            -            B‡,v     Eject current page; next page number N.
.pn ±N          N=1            ignored      -        Next page number N.
.po ±N          0; 26/27 in    previous     v        Page offset.
.ne N           -              N=1V         D,v      Need N vertical space (V = vertical spacing).
.mk R           none           internal     D        Mark current vertical place in register R.
.rt ±N          none           internal     D,v      Return (upward only) to marked vertical place.

4. Text Filling, Adjusting, and Centering

.br             -              -            B        Break.
.fi             fill           -            B,E      Fill output lines.
.nf             fill           -            B,E      No filling or adjusting of output lines.
.ad c           adj, both      adjust       E        Adjust output lines with mode c.
.na             adjust         -            E        No output line adjusting.
.ce N           off            N=1          B,E      Center following N input text lines.

5. Vertical Spacing

.vs N           1/6in; 12pts   previous     E,p      Vertical base line spacing (V).
.ls N           N=1            previous     E        Output N-1 Vs after each text output line.
.sp N           -              N=1V         B,v      Space vertical distance N in either direction.
.sv N           -              N=1V         v        Save vertical distance N.
.os             -              -            -        Output saved vertical distance.
.ns             space          -            D        Turn no-space mode on.
.rs             -              -            D        Restore spacing; turn no-space mode off.

6. Line Length and Indenting

.ll ±N          6.5 in         previous     E,m      Line length.
.in ±N          N=0            previous     B,E,m    Indent.
.ti ±N          -              ignored      B,E,m    Temporary indent.

7. Macros, Strings, Diversion, and Position Traps

.de xx yy       -              .yy=..       -        Define or redefine macro xx; end at call of yy.
.am xx yy       -              .yy=..       -        Append to a macro.
.ds xx string   -              ignored      -        Define a string xx containing string.
.as xx string   -              ignored      -        Append string to string xx.


*Values separated by ";" are for NROFF and TROFF respectively.
#Notes are explained at the end of this Summary and Index.
†No effect in NROFF.
‡The use of " ' " as control character (instead of ".") suppresses the break function.
.rm xx          -              ignored      -        Remove request, macro, or string.
.rn xx yy       -              ignored      -        Rename request, macro, or string xx to yy.
.di xx          -              end          D        Divert output to macro xx.
.da xx          -              end          D        Divert and append to xx.
.wh N xx        -              -            v        Set location trap; negative is w.r.t. page bottom.
.ch xx N        -              -            v        Change trap location.
.dt N xx        -              off          D,v      Set a diversion trap.
.it N xx        -              off          E        Set an input-line count trap.
.em xx          none           none         -        End macro is xx.

8. Number Registers

.nr R ±N M                     -            u        Define and set number register R; auto-increment by M.
.af R c         arabic         -            -        Assign format to register R (c=1, i, I, a, A).
.rr R           -              -            -        Remove register R.

9. Tabs, Leaders, and Fields

.ta Nt ...      0.8; 0.5in     none         E,m      Tab settings; left type, unless t=R (right), C (centered).
.tc c           none           none         E        Tab repetition character.
.lc c           .              none         E        Leader repetition character.
.fc a b         off            off          -        Set field delimiter a and pad character b.

10. Input and Output Conventions and Character Translations

.ec c           \              \            -        Set escape character.
.eo             on             -            -        Turn off escape character mechanism.
.lg N           -; on          on           -        Ligature mode on if N>0.
.ul N           off            N=1          E        Underline (italicize in TROFF) N input lines.
.cu N           off            N=1          E        Continuous underline in NROFF; like ul in TROFF.
.uf F           Italic         Italic       -        Underline font set to F (to be switched to by ul).
.cc c           .              .            E        Set control character to c.
.c2 c           '              '            E        Set nobreak control character to c.
.tr abcd....    none           -            O        Translate a to b, etc. on output.

11. Local Horizontal and Vertical Motions, and the Width Function

12. Overstrike, Bracket, Line-drawing, and Zero-width Functions

13. Hyphenation.

.nh             hyphenate      -            E        No hyphenation.
.hy N           hyphenate      hyphenate    E        Hyphenate; N = mode.
.hc c           \%             \%           E        Hyphenation indicator character c.
.hw word1 ...                  ignored      -        Exception words.

14. Three Part Titles.

.tl 'left'center'right'        -            -        Three part title.
.pc c           %              off          -        Page number character.
.lt ±N          6.5 in         previous     E,m      Length of title.

15. Output Line Numbering.

.nm ±N M S I                   off          E        Number mode on or off, set parameters.
.nn N           -              N=1          E        Do not number next N lines.

16. Conditional Acceptance of Input

.if c anything                 -            -        If condition c true, accept anything as input;
                                                     for multi-line use \{anything\}.

.if !c anything                -            -        If condition c false, accept anything.
.if N anything                 -            u        If expression N > 0, accept anything.
.if !N anything                -            u        If expression N ≤ 0, accept anything.
.if 'string1'string2' anything -            -        If string1 identical to string2, accept anything.
.if !'string1'string2' anything -           -        If string1 not identical to string2, accept anything.
.ie c anything                 -            u        If portion of if-else; all above forms (like if).
.el anything                   -            -        Else portion of if-else.

17. Environment Switching.

.ev N           N=0            previous     -        Environment switched (push down).

18. Insertions from the Standard Input

.rd prompt      -              prompt=BEL   -        Read insertion.
.ex             -              -            -        Exit from NROFF/TROFF.

19. Input/Output File Switching

.so filename                   -            -        Switch source file (push down).
.nx filename                   end-of-file  -        Next file.
.pi program                    -            -        Pipe output to program (NROFF only).

20. Miscellaneous

.mc c N         -              off          E,m      Set margin character c and separation N.
.tm string      -              newline      -        Print string on terminal (UNIX standard message output).
.ig yy          -              .yy=..       -        Ignore till call of yy.
.pm t           -              all          -        Print macro names and sizes;
                                                     if t present, print only total of sizes.
.fl             -              -            B        Flush output buffer.

21. Output and Error Messages
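
Several of the requests summarized above (.nr with auto-increment, .de, .sp, .ne) can be combined into a small macro. The following is a sketch only; the register name P and the macro name NP are arbitrary choices for this example.

```roff
.\" Register P starts at 0 and auto-increments by 1.
.nr P 0 1
.\" Macro NP: begin a numbered paragraph.
.de NP
.sp
.ne 2
\\n+P.
..
.NP
First numbered paragraph.
.NP
Second numbered paragraph.
```

The doubled backslash delays interpolation of the register until the macro is called, so each call increments P and prints its new value.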





Notes-

B        Request normally causes a break.
D        Mode or relevant parameters associated with current diversion level.
E        Relevant parameters are a part of the current environment.
O        Must stay in effect until logical output.
P        Mode must be still or again in effect at the time of physical output.
v,p,m,u  Default scale indicator; if not specified, scale indicators are ignored.


Alphabetical Request and Section Number Cross Reference 


ad 4    cc 10   ds 7    fc 9    ie 16   ll 6    nh 13   pi 19   rn 7    ta 9    vs 5
af 8    ce 4    dt 7    fi 4    if 16   ls 5    nm 15   pl 3    rr 8    tc 9    wh 7
am 7    ch 7    ec 10   fl 20   ig 20   lt 14   nn 15   pm 20   rs 5    ti 6
as 7    cs 2    el 16   fp 2    in 6    mc 20   nr 8    pn 3    rt 3    tl 14
bd 2    cu 10   em 7    ft 2    it 7    mk 3    ns 5    po 3    so 19   tm 20
bp 3    da 7    eo 10   hc 13   lc 9    na 4    nx 19   ps 2    sp 5    tr 10
br 4    de 7    ev 17   hw 13   lg 10   ne 3    os 5    rd 18   ss 2    uf 10
c2 10   di 7    ex 18   hy 13   li 10   nf 4    pc 14   rm 7    sv 5    ul 10
Escape Sequences for Characters, Indicators, and Functions


Section      Escape
Reference    Sequence           Meaning

10.1         \\                 \ (to prevent or delay the interpretation of \)
10.1         \e                 Printable version of the current escape character.
2.1          \'                 ' (acute accent); equivalent to \(aa
2.1          \`                 ` (grave accent); equivalent to \(ga
2.1          \-                 - Minus sign in the current font
7            \.                 Period (dot) (see de)
11.1         \(space)           Unpaddable space-size space character
11.1         \0                 Digit width space
11.1         \|                 1/6 em narrow space character (zero width in NROFF)
11.1         \^                 1/12 em half-narrow space character (zero width in NROFF)
4.1          \&                 Non-printing, zero width character
10.6         \!                 Transparent line indicator
10.7         \"                 Beginning of comment
7.3          \$N                Interpolate argument 1≤N≤9
13           \%                 Default optional hyphenation character
2.1          \(xx               Character named xx
7.1          \*x, \*(xx         Interpolate string x or xx
9.1          \a                 Non-interpreted leader character
12.3         \b'abc...'         Bracket building function
4.2          \c                 Interrupt text processing
11.1         \d                 Forward (down) 1/2 em vertical motion (1/2 line in NROFF)
2.2          \fx, \f(xx, \fN    Change to font named x or xx, or position N
11.1         \h'N'              Local horizontal motion; move right N (negative left)
11.3         \kx                Mark horizontal input place in register x
12.4         \l'Nc'             Horizontal line drawing function (optionally with c)
12.4         \L'Nc'             Vertical line drawing function (optionally with c)
8            \nx, \n(xx         Interpolate number register x or xx
12.1         \o'abc...'         Overstrike characters a, b, c, ...
4.1          \p                 Break and spread output line
11.1         \r                 Reverse 1 em vertical motion (reverse line in NROFF)
2.3          \sN, \s±N          Point-size change function
9.1          \t                 Non-interpreted horizontal tab
11.1         \u                 Reverse (up) 1/2 em vertical motion (1/2 line in NROFF)
11.1         \v'N'              Local vertical motion; move down N (negative up)
11.2         \w'string'         Interpolate width of string
5.2          \x'N'              Extra line-space function (negative before, positive after)
12.2         \zc                Print c with zero width (without spacing)
16           \{                 Begin conditional input
16           \}                 End conditional input
10.7         \(newline)         Concealed (ignored) newline
-            \X                 X, any character not listed above


The escape sequences \\, \., \", \$, \*, \a, \n, \t, and \(newline) are interpreted in copy mode (§7.2).
Predefined General Number Registers


Section      Register
Reference    Name        Description

3            %           Current page number.
11.2         ct          Character type (set by width function).
7.4          dl          Width (maximum) of last completed diversion.
7.4          dn          Height (vertical size) of last completed diversion.
-            dw          Current day of the week (1-7).
-            dy          Current day of the month (1-31).
11.3         hp          Current horizontal place on input line.
15           ln          Output line number.
-            mo          Current month (1-12).
4.1          nl          Vertical position of last printed text base-line.
11.2         sb          Depth of string below base line (generated by width function).
11.2         st          Height of string above base line (generated by width function).
-            yr          Last two digits of current year.

Predefined Read-Only Number Registers


Section      Register
Reference    Name        Description

7.3          .$          Number of arguments available at the current macro level.
-            .A          Set to 1 in TROFF, if -a option used; always 1 in NROFF.
11.1         .H          Available horizontal resolution in basic units.
-            .T          Set to 1 in NROFF, if -T option used; always 0 in TROFF.
11.1         .V          Available vertical resolution in basic units.
5.2          .a          Post-line extra line-space most recently utilized using \x'N'.
-            .c          Number of lines read from current input file.
7.4          .d          Current vertical place in current diversion; equal to nl, if no diversion.
2.2          .f          Current font as physical quadrant (1-4).
4            .h          Text base-line high-water mark on current page or diversion.
6            .i          Current indent.
6            .l          Current line length.
4            .n          Length of text portion on previous output line.
3            .o          Current page offset.
3            .p          Current page length.
2.3          .s          Current point size.
7.5          .t          Distance to the next trap.
4            .u          Equal to 1 in fill mode and 0 in nofill mode.
5.1          .v          Current vertical line spacing.
11.2         .w          Width of previous character.
-            .x          Reserved version-dependent register.
-            .y          Reserved version-dependent register.
7.4          .z          Name of current diversion.
REFERENCE MANUAL


1. General Explanation 


1.1. Form of input. Input consists of text lines, which are destined to be printed, interspersed with control
lines, which set parameters or otherwise control subsequent processing. Control lines begin with a
control character, normally . (period) or ' (acute accent), followed by a one or two character name that
specifies a basic request or the substitution of a user-defined macro in place of the control line. The
control character ' suppresses the break function (the forced output of a partially filled line) caused by
certain requests. The control character may be separated from the request/macro name by white space
(spaces and/or tabs) for esthetic reasons. Names must be followed by either space or newline. Control
lines with unrecognized names are ignored.


Various special functions may be introduced anywhere in the input by means of an escape character,
normally \. For example, the function \nR causes the interpolation of the contents of the number
register R in place of the function; here R is either a single character name as in \nx, or a
left-parenthesis-introduced, two-character name as in \n(xx.


1.2. Formatter and device resolution. TROFF internally uses 432 units/inch, corresponding to the Graphic 
Systems phototypesetter which has a horizontal resolution of 1/432 inch and a vertical resolution of 
1/144 inch. NROFF internally uses 240 units/inch, corresponding to the least common multiple of the 
horizontal and vertical resolutions of various typewriter-like output devices. TROFF rounds 
horizontal/vertical numerical parameter input to the actual horizontal/vertical resolution of the Graphic 
Systems typesetter. NROFF similarly rounds numerical input to the actual resolution of the output
device indicated by the -T option (default Model 37 Teletype).


1.3. Numerical parameter input. Both NROFF and TROFF accept numerical input with the appended scale
indicators shown in the following table, where S is the current type size in points, V is the current
vertical line spacing in basic units, and C is a nominal character width in basic units.


Scale                                  Number of basic units
Indicator    Meaning                   TROFF           NROFF

i            Inch                      432             240
c            Centimeter                432x50/127      240x50/127
P            Pica = 1/6 inch           72              240/6
m            Em = S points             6xS             C
n            En = Em/2                 3xS             C, same as Em
p            Point = 1/72 inch         6               240/72
u            Basic unit                1               1
v            Vertical line space       V               V
none         Default, see below





In NROFF, both the em and the en are taken to be equal to the C, which is output-device dependent;
common values are 1/10 and 1/12 inch. Actual character widths in NROFF need not be all the same
and constructed characters such as -> (→) are often extra wide. The default scaling is ems for the
horizontally-oriented requests and functions ll, in, ti, ta, lt, po, mc, \h, and \l; Vs for the
vertically-oriented requests and functions pl, wh, ch, dt, sp, sv, ne, rt, \v, \x, and \L; p for the vs
request; and u for the requests nr, if, and ie. All other requests ignore any scale indicators. When a
number register containing an already appropriately scaled number is interpolated to provide numerical
input, the unit scale indicator u may need to be appended to prevent an additional inappropriate default
scaling.

The number, N, may be specified in decimal-fraction form but the parameter finally stored is rounded
to an integer number of basic units.
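
The scale-indicator table can be checked with a short calculation. The following Python sketch is not part of NROFF or TROFF; the function name to_basic_units, its parameters, and the fallback values for V and C are assumptions made only for this illustration. It applies the factors from the table and rounds the result to a whole number of basic units. Whether the formatters round or truncate in every case is not stated here, so the rounding rule is also an assumption.

```python
from fractions import Fraction

# Basic units per inch, as given in section 1.2.
UNITS_PER_INCH = {"troff": 432, "nroff": 240}


def to_basic_units(value, indicator, formatter="troff",
                   point_size=10, vertical_spacing=None, char_width=None):
    """Convert a number with a scale indicator to an integer count of basic units.

    Hypothetical helper: the defaults below (V = 1/6 inch, C = 1/10 inch)
    are illustrative assumptions, not values read from a real device table.
    """
    per_inch = UNITS_PER_INCH[formatter]
    if formatter == "troff":
        em = 6 * point_size          # Em = S points = 6 x S units
        en = 3 * point_size          # En = Em/2
    else:
        # In NROFF both em and en equal the nominal character width C.
        em = en = char_width if char_width is not None else per_inch // 10
    if vertical_spacing is None:
        vertical_spacing = per_inch // 6
    factors = {
        "i": Fraction(per_inch),
        "c": Fraction(per_inch * 50, 127),
        "P": Fraction(per_inch, 6),
        "m": Fraction(em),
        "n": Fraction(en),
        "p": Fraction(per_inch, 72),
        "u": Fraction(1),
        "v": Fraction(vertical_spacing),
    }
    # Exact rational arithmetic, then round to an integer number of units.
    return round(Fraction(str(value)) * factors[indicator])


print(to_basic_units(1, "i"))      # one inch in TROFF units
print(to_basic_units(3.2, "c"))    # 3.2 centimeters in TROFF units
```

Exact fractions are used so that a value such as 3.2c is not disturbed by binary floating point before the final rounding.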


The absolute position indicator | may be prepended to a number N to generate the distance to the vertical
or horizontal place N. For vertically-oriented requests and functions, |N becomes the distance in basic
units from the current vertical place on the page or in a diversion (§7.4) to the vertical place N. For
all other requests and functions, |N becomes the distance from the current horizontal place on the input
line to the horizontal place N. For example,


     .sp |3.2c
will space in the required direction to 3.2 centimeters from the top of the page.


1.4. Numerical expressions. Wherever numerical input is expected an expression involving parentheses,
the arithmetic operators +, -, /, *, % (mod), and the logical operators <, >, <=, >=, = (or ==),
& (and), : (or) may be used. Except where controlled by parentheses, evaluation of expressions is
left-to-right; there is no operator precedence. In the case of certain requests, an initial + or - is
stripped and interpreted as an increment or decrement indicator respectively. In the presence of default
scaling, the desired scale indicator must be attached to every number in an expression for which the
desired and default scaling differ. For example, if the number register x contains 2 and the current
point size is 10, then


     .ll (4.25i+\nxP+3)/2u
will set the line length to 1/2 the sum of 4.25 inches + 2 picas + 30 points.
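
The absence of operator precedence is easy to misread, so a small model may help. The Python sketch below is an illustration only: eval_left_to_right is a hypothetical name, parenthesized groups are represented as nested lists, and truncating integer division on non-negative values is an assumption. It evaluates the line-length example with every term already converted to TROFF basic units at 10 points: 4.25i = 1836, 2P = 144, and 3 ems (the default scale for ll) = 180.

```python
def eval_left_to_right(tokens):
    """Evaluate [operand, op, operand, op, ...] strictly left to right.

    A nested list stands for a parenthesized sub-expression. Only the
    arithmetic operators are modeled; the logical ones are omitted.
    """
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a // b,   # assumption: integer division, operands >= 0
        "%": lambda a, b: a % b,
    }

    def value(tok):
        # Parentheses are the only way to force a different grouping.
        return eval_left_to_right(tok) if isinstance(tok, list) else tok

    result = value(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = ops[op](result, value(operand))
    return result


# (4.25i + 2P + 3m) / 2u in basic units: (1836 + 144 + 180) / 2
line_length = eval_left_to_right([[1836, "+", 144, "+", 180], "/", 2])
print(line_length, line_length / 432)   # units, then inches

# Without precedence, 2+3*4 is (2+3)*4, not 2+(3*4).
print(eval_left_to_right([2, "+", 3, "*", 4]))
```

The first result is 1080 basic units, or 2.5 inches, which agrees with half of 4.25 inches + 2 picas + 30 points.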


1.5. Notation. Numerical parameters are indicated in this manual in two ways. ±N means that the
argument may take the forms N, +N, or -N and that the corresponding effect is to set the affected
parameter to N, to increment it by N, or to decrement it by N respectively. Plain N means that an
initial algebraic sign is not an increment indicator, but merely the sign of N. Generally, unreasonable
numerical input is either ignored or truncated to a reasonable value. For example, most requests
expect to set parameters to non-negative values; exceptions are sp, wh, ch, nr, and if. The requests
ps, ft, po, vs, ls, ll, in, and lt restore the previous parameter value in the absence of an argument.
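
The ±N convention and the restoring of a previous value can be modeled in a few lines. This Python sketch is hypothetical: the class name Parameter, the use of plain integers, and the clamping of negative results to zero are assumptions chosen to mirror the wording above, not a description of the formatter's internals.

```python
class Parameter:
    """Model of a setting that accepts N, +N, -N, or no argument."""

    def __init__(self, initial):
        self.current = initial
        self.previous = initial

    def request(self, arg=None):
        if arg is None:
            new = self.previous                    # no argument: restore previous value
        elif arg.startswith("+"):
            new = self.current + int(arg[1:])      # +N: increment
        elif arg.startswith("-"):
            new = self.current - int(arg[1:])      # -N: decrement
        else:
            new = int(arg)                         # N: set
        new = max(0, new)   # assumption: unreasonable input truncated to 0
        self.previous, self.current = self.current, new
        return self.current


size = Parameter(10)
print(size.request("+2"))   # increment from the initial value
print(size.request())       # restore the previous value
```

Because the old value is remembered on every change, two successive requests without an argument switch back and forth between the last two values in this model.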


Single character arguments are indicated by single lower case letters and one/two character arguments 
are indicated by a pair of lower case letters. Character string arguments are indicated by multi-character 
mnemonics. 


2. Font and Character Size Control 


2.1. Character set. The TROFF character set consists of the Graphics Systems Commercial II character
set plus a Special Mathematical Font character set, each having 102 characters. These character sets
are shown in the attached Table I. All ASCII characters are included, with some on the Special Font.
With three exceptions, the ASCII characters are input as themselves, and non-ASCII characters are input
in the form \(xx where xx is a two-character name given in the attached Table II. The three ASCII
exceptions are mapped as follows:
















     ASCII Input                     Printed by TROFF
     Character   Name                Character   Name

     '           acute accent        ’           close quote
     `           grave accent        ‘           open quote
     -           minus               -           hyphen
















The characters ', `, and - may be input by \', \`, and \- respectively or by their names (Table II).
The ASCII characters @, #, ", ', `, <, >, \, {, }, ~, ^, and _ exist only on the Special Font and are
printed as a 1-em space if that Font is not mounted.


NROFF understands the entire TROFF character set, but can in general print only ASCII characters,
additional characters as may be available on the output device, such characters as may be able to be
constructed by overstriking or other combination, and those that can reasonably be mapped into other
printable characters. The exact behavior is determined by a driving table prepared for each device. The
characters ', `, and _ print as themselves.


2.2. Fonts. The default mounted fonts are Times Roman (R), Times Italic (I), Times Bold (B), and
the Special Mathematical Font (S) on physical typesetter positions 1, 2, 3, and 4 respectively. These
fonts are used in this document. The current font, initially Roman, may be changed (among the
mounted fonts) by use of the ft request, or by imbedding at any desired point either \fx, \f(xx, or \fN
where x and xx are the name of a mounted font and N is a numerical font position. It is not necessary
to change to the Special Font; characters on that font are automatically handled. A request for a named
but not-mounted font is ignored. TROFF can be informed that any particular font is mounted by use of
the fp request. The list of known fonts is installation dependent. In the subsequent discussion of
font-related requests, F represents either a one/two-character font name or the numerical font position,
1-4. The current font is available (as numerical position) in the read-only number register .f.


NROFF understands font control and normally underlines Italic characters (see §10.5).


2.3. Character size. Character point sizes available on the Graphic Systems typesetter are 6, 7, 8, 9, 10,
11, 12, 14, 16, 18, 20, 22, 24, 28, and 36. This is a range of 1/12 inch to 1/2 inch. The ps request is
used to change or restore the point size. Alternatively the point size may be changed between any two
characters by imbedding a \sN at the desired point to set the size to N, or a \s±N (1≤N≤9) to
increment/decrement the size by N; \s0 restores the previous size. Requested point size values that are
between two valid sizes yield the larger of the two. The current size is available in the .s register.
NROFF ignores type size control.
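
The rule for sizes that fall between two valid sizes amounts to a search in a sorted list. The Python sketch below is an illustration only; the name nearest_valid_size is hypothetical, and treating a request below 6 points as a request for 6 is an assumption drawn from the "next larger valid size" wording in the ps entry.

```python
import bisect

# Point sizes available on the Graphic Systems typesetter (section 2.3).
VALID_SIZES = [6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 28, 36]


def nearest_valid_size(requested):
    """Return the requested size if valid, else the next larger one (max 36)."""
    i = bisect.bisect_left(VALID_SIZES, requested)   # first size >= requested
    return VALID_SIZES[min(i, len(VALID_SIZES) - 1)]


print(nearest_valid_size(13))   # between 12 and 14
print(nearest_valid_size(40))   # above the largest size
```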


Request      Initial     If No
Form         Value       Argument    Notes*   Explanation

.ps ±N       10 point    previous    E        Point size set to ±N. Alternatively imbed \sN or \s±N.
                                              Any positive size value may be requested; if invalid, the
                                              next larger valid size will result, with a maximum of 36.
                                              A paired sequence +N, -N will work because the previous
                                              requested value is also remembered. Ignored in NROFF.

.ss N        12/36 em    ignored     E        Space-character size is set to N/36 ems. This size is the
                                              minimum word spacing in adjusted text. Ignored in
                                              NROFF.

.cs F N M    off         -           P        Constant character space (width) mode is set on for font
                                              F (if mounted); the width of every character will be
                                              taken to be N/36 ems. If M is absent, the em is that of
                                              the character's point size; if M is given, the em is
                                              M-points. All affected characters are centered in this
                                              space, including those with an actual width larger than
                                              this space. Special Font characters occurring while the
                                              current font is F are also so treated. If N is absent, the
                                              mode is turned off. The mode must be still or again in
                                              effect when the characters are physically printed. Ignored
                                              in NROFF.

.bd F N      off         -           P        The characters in font F will be artificially emboldened by
                                              printing each one twice, separated by N-1 basic units. A
                                              reasonable value for N is 3 when the character size is in
                                              the vicinity of 10 points. If N is missing the embolden
                                              mode is turned off. The column heads above were
                                              printed with .bd I 3. The mode must be still or again in
                                              effect when the characters are physically printed. Ignored
                                              in NROFF.

.bd S F N    off         -           P        The characters in the Special Font will be emboldened
                                              whenever the current font is F. This manual was printed
                                              with .bd S B 3. The mode must be still or again in effect
                                              when the characters are physically printed.

.ft F        Roman       previous    E        Font changed to F. Alternatively, imbed \fF. The font
                                              name P is reserved to mean the previous font.

.fp N F      R,I,B,S     ignored     -        Font position. This is a statement that a font named F is
                                              mounted on position N (1-4). It is a fatal error if F is
                                              not known. The phototypesetter has four fonts physically
                                              mounted. Each font consists of a film strip which can be
                                              mounted on a numbered quadrant of a wheel. The
                                              default mounting sequence assumed by TROFF is R, I, B,
                                              and S on positions 1, 2, 3 and 4.


*Notes are explained at the end of the Summary and Index above.


3. Page control 


Top and bottom margins are not automatically provided; it is conventional to define two macros and to 
set traps for them at vertical positions 0 (top) and -N (N from the bottom). See §7 and Tutorial
Examples §T2. A pseudo-page transition onto the first page occurs either when the first break occurs or 
when the first non-diverted text processing occurs. Arrangements for a trap to occur at the top of the 
first page must be completed before this transition. In the following, references to the current diversion 
(§7.4) mean that the mechanism being described works during both ordinary and diverted output (the 
former considered as the top diversion level). 


The useable page width on the Graphic Systems phototypesetter is about 7.54 inches, beginning about 
1/27 inch from the left edge of the 8 inch wide, continuous roll paper. The physical limitations on 
NROFF output are output-device dependent. 


Request      Initial        If No
Form         Value          Argument    Notes    Explanation

.pl ±N       11 in          11 in       v        Page length set to ±N. The internal limitation is about
                                                 75 inches in TROFF and about 136 inches in NROFF.
                                                 The current page length is available in the .p register.

.bp ±N       N=1            -           B*,v     Begin page. The current page is ejected and a new page
                                                 is begun. If ±N is given, the new page number will be
                                                 ±N. Also see request ns.

.pn ±N       N=1            ignored     -        Page number. The next page (when it occurs) will have
                                                 the page number ±N. A pn must occur before the initial
                                                 pseudo-page transition to affect the page number of
                                                 the first page. The current page number is in the %
                                                 register.

.po ±N       0; 26/27 in†   previous    v        Page offset. The current left margin is set to ±N. The
                                                 TROFF initial value provides about 1 inch of paper
                                                 margin including the physical typesetter margin of 1/27
                                                 inch. In TROFF the maximum (line-length)+(page-offset)
                                                 is about 7.54 inches. See §6. The current page offset is
                                                 available in the .o register.

.ne N        -              N=1 V       D,v      Need N vertical space. If the distance, D, to the next
                                                 trap position (see §7.5) is less than N, a forward vertical
                                                 space of size D occurs, which will spring the trap. If
                                                 there are no remaining traps on the page, D is the
                                                 distance to the bottom of the page. If D<V, another
                                                 line could still be output and spring the trap. In a
                                                 diversion, D is the distance to the diversion trap, if any,
                                                 or is very large.

.mk R        none           internal    D        Mark the current vertical place in an internal register
                                                 (both associated with the current diversion level), or in
                                                 register R, if given. See rt request.

.rt ±N       none           internal    D,v      Return upward only to a marked vertical place in the
                                                 current diversion. If ±N (w.r.t. current place) is given,
                                                 the place is ±N from the top of the page or diversion or,
                                                 if N is absent, to a place marked by a previous mk. Note
                                                 that the sp request (§5.3) may be used in all cases
                                                 instead of rt by spacing to the absolute place stored in an
                                                 explicit register; e.g. using the sequence .mk R ...
                                                 .sp |\nRu.


*The use of "'" as control character (instead of ".") suppresses the break function.
†Values separated by ";" are for NROFF and TROFF respectively.


4. Text Filling, Adjusting, and Centering 


4.1. Filling and adjusting. Normally, words are collected from input text lines and assembled into an
output text line until some word doesn't fit. An attempt is then made to hyphenate the word in an effort
to assemble a part of it into the output line. The spaces between the words on the output line are then
increased to spread out the line to the current line length minus any current indent. A word is any string
of characters delimited by the space character or the beginning/end of the input line. Any adjacent pair
of words that must be kept together (neither split across output lines nor spread apart in the adjustment
process) can be tied together by separating them with the unpaddable space character "\ "
(backslash-space). The adjusted word spacings are uniform in TROFF and the minimum interword spacing
can be controlled with the ss request (§2). In NROFF, they are normally nonuniform because of
quantization to character-size spaces; however, the command line option -e causes uniform spacing with
full output device resolution. Filling, adjustment, and hyphenation (§13) can all be prevented or
controlled. The text length on the last line output is available in the .n register, and text base-line
position on the page for this line is in the nl register. The text base-line high-water mark (lowest place)
on the current page is in the .h register.
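
Filling and adjusting can be pictured as two separate steps. The Python sketch below is a simplified model under stated assumptions, not the formatter's algorithm: the function names fill and adjust are hypothetical, widths are counted in whole character cells as on an NROFF-style device, and hyphenation, indents, sentence spacing, and unpaddable spaces are all left out. Placing the surplus spaces in the leftmost gaps is likewise an arbitrary choice for the illustration.

```python
def fill(words, line_length):
    """Collect words into lines until the next word does not fit."""
    lines, current = [], []
    for word in words:
        candidate = " ".join(current + [word])
        if current and len(candidate) > line_length:
            lines.append(current)      # line is full: start a new one
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(current)
    return lines


def adjust(line_words, line_length):
    """Widen the gaps between words so the line is exactly line_length wide."""
    gaps = len(line_words) - 1
    if gaps == 0:
        return line_words[0]           # a single word cannot be spread
    total_spaces = line_length - sum(len(w) for w in line_words)
    base, extra = divmod(total_spaces, gaps)
    pieces = []
    for i, word in enumerate(line_words[:-1]):
        # The first `extra` gaps receive one additional space (non-uniform gaps).
        pieces.append(word + " " * (base + (1 if i < extra else 0)))
    pieces.append(line_words[-1])
    return "".join(pieces)


for line in fill("the quick brown fox jumps over".split(), 15):
    print(repr(adjust(line, 15)))
```

The uneven gaps in the second output line correspond to the quantization to character-size spaces mentioned above for NROFF.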


An input text line ending with ., ?, or ! is taken to be the end of a sentence, and an additional space 
character is automatically provided during filling. Multiple inter-word space characters found in the 
input are retained, except for trailing spaces; initial spaces also cause a break. 


When filling is in effect, a \p may be imbedded or attached to a word to cause a break at the end of the 
word and have the resulting output line spread out to fill the current line length. 


A text input line that happens to begin with a control character can be made to not look like a control 
line by prefacing it with the non-printing, zero-width filler character \&. Still another way is to specify
output translation of some convenient character into the control character using tr (§10.5).


4.2. Interrupted text. The copying of an input line in nofill (non-fill) mode can be interrupted by
terminating the partial line with a \c. The next encountered input text line will be considered to be a
continuation of the same line of input text. Similarly, a word within filled text may be interrupted by
terminating the word (and line) with \c; the next encountered text will be taken as a continuation of the
interrupted word. If the intervening control lines cause a break, any partial line will be forced out along
with any partial word.


Request      Initial     If No
Form         Value       Argument    Notes    Explanation

.br          -           -           B        Break. The filling of the line currently being collected is
                                              stopped and the line is output without adjustment. Text
                                              lines beginning with space characters and empty text
                                              lines (blank lines) also cause a break.

.fi          fill on     -           B,E      Fill subsequent output lines. The register .u is 1 in fill
                                              mode and 0 in nofill mode.

.nf          fill on     -           B,E      Nofill. Subsequent output lines are neither filled nor
                                              adjusted. Input text lines are copied directly to output
                                              lines without regard for the current line length.

.ad c        adj,both    adjust      E        Line adjustment is begun. If fill mode is not on,
                                              adjustment will be deferred until fill mode is back on. If
                                              the type indicator c is present, the adjustment type is
                                              changed as shown in the following table.

                                              Indicator    Adjust Type
                                              l            adjust left margin only
                                              r            adjust right margin only
                                              c            center
                                              b or n       adjust both margins
                                              absent       unchanged

.na          adjust      -           E        Noadjust. Adjustment is turned off; the right margin will
                                              be ragged. The adjustment type for ad is not changed.
                                              Output line filling still occurs if fill mode is on.

.ce N        off         N=1         B,E      Center the next N input text lines within the current
                                              (line-length minus indent). If N=0, any residual count
                                              is cleared. A break occurs after each of the N input
                                              lines. If the input line is too long, it will be left adjusted.


5. Vertical Spacing 


5.1. Base-line spacing. The vertical spacing (V) between the base-lines of successive output lines can be 
set using the vs request with a resolution of 1/144 inch = 1/2 point in TROFF, and to the output device 
resolution in NROFF. V must be large enough to accommodate the character sizes on the affected out- 
put lines. For the common type sizes (9-12 points), usual typesetting practice is to set V to 2 points 
greater than the point size; TROFF default is 10-point type on a 12-point spacing (as in this document). 
The current V is available in the .v register. Multiple-V line separation (e.g. double spacing) may be 
requested with ls. 


5.2. Extra line-space. If a word contains a vertically tall construct requiring the output line containing it
to have extra vertical space before and/or after it, the extra-line-space function \x'N' can be imbedded
in or attached to that word. In this and other functions having a pair of delimiters around their
parameter (here '), the delimiter choice is arbitrary, except that it can't look like the continuation of a
number expression for N. If N is negative, the output line containing the word will be preceded by N extra
vertical space; if N is positive, the output line containing the word will be followed by N extra vertical
space. If successive requests for extra space apply to the same line, the maximum values are used.
The most recently utilized post-line extra line-space is available in the .a register.


5.3. Blocks of vertical space. A block of vertical space is ordinarily requested using sp, which honors the 
no-space mode and which does not space past a trap. A contiguous block of vertical space may be 
reserved using sv. 


Request      Initial           If No
Form         Value             Argument    Notes    Explanation

.vs N        1/6 in; 12 pts    previous    E,p      Set vertical base-line spacing size V. Transient extra
                                                    vertical space available with \x'N' (see above).

.ls N        N=1               previous    E        Line spacing set to ±N. N-1 Vs (blank lines) are
                                                    appended to each output text line. Appended blank lines
                                                    are omitted, if the text or previous appended blank line
                                                    reached a trap position.

.sp N        -                 N=1 V       B,v      Space vertically in either direction. If N is negative, the
                                                    motion is backward (upward) and is limited to the
                                                    distance to the top of the page. Forward (downward)
                                                    motion is truncated to the distance to the nearest trap. If
                                                    the no-space mode is on, no spacing occurs (see ns and
                                                    rs below).

.sv N        -                 N=1 V       v        Save a contiguous vertical block of size N. If the
                                                    distance to the next trap is greater than N, N vertical space
                                                    is output. No-space mode has no effect. If this distance
                                                    is less than N, no vertical space is immediately output,
                                                    but N is remembered for later output (see os).
                                                    Subsequent sv requests will overwrite any still
                                                    remembered N.

.os          -                 -           -        Output saved vertical space. No-space mode has no
                                                    effect. Used to finally output a block of vertical space
                                                    requested by an earlier sv request.

.ns          space             -           D        No-space mode turned on. When on, the no-space mode
                                                    inhibits sp requests and bp requests without a next page
                                                    number. The no-space mode is turned off when a line of
                                                    output occurs, or with rs.

.rs          space             -           D        Restore spacing. The no-space mode is turned off.

Blank text line.               -           B        Causes a break and output of a blank line exactly like
                                                    sp 1.


6. Line Length and Indenting 


The maximum line length for fill mode may be set with ll. The indent may be set with in; an indent
applicable to only the next output line may be set with ti. The line length includes indent space but not
page offset space. The line-length minus the indent is the basis for centering with ce. The effect of ll,
in, or ti is delayed, if a partially collected line exists, until after that line is output. In fill mode the
length of text on an output line is less than or equal to the line length minus the indent. The current
line length and indent are available in registers .l and .i respectively. The length of three-part titles
produced by tl (see §14) is independently set by lt.


Request      Initial     If No
Form         Value       Argument    Notes    Explanation

.ll ±N       6.5 in      previous    E,m      Line length is set to ±N. In TROFF the maximum
                                              (line-length)+(page-offset) is about 7.54 inches.

.in ±N       N=0         previous    B,E,m    Indent is set to ±N. The indent is prepended to each
                                              output line.

.ti ±N       -           ignored     B,E,m    Temporary indent. The next output text line will be
                                              indented a distance ±N with respect to the current
                                              indent. The resulting total indent may not be negative.
                                              The current indent is not changed.


7. Macros, Strings, Diversion, and Position Traps 


7.1. Macros and strings. A macro is a named set of arbitrary lines that may be invoked by name or with 
a trap. A string is a named string of characters, not including a newline character, that may be interpo- 
lated by name at any point. Request, macro, and string names share the same name list. Macro and 
string names may be one or two characters long and may usurp previously defined request, macro, or 
string names. Any of these entities may be renamed with rn or removed with rm. Macros are created 
by de and di, and appended to by am and da; di and da cause normal output to be stored in a macro. 
Strings are created by ds and appended to by as. A macro is invoked in the same way as a request; a
control line beginning .xx will interpolate the contents of macro xx. The remainder of the line may
contain up to nine arguments. The strings x and xx are interpolated at any desired point with \*x and
\*(xx respectively. String references and macro invocations may be nested.


7.2. Copy mode input interpretation. During the definition and extension of strings and macros (not by 
diversion) the input is read in copy mode. The input is copied without interpretation except that: 


• The contents of number registers indicated by \n are interpolated.
• Strings indicated by \* are interpolated.
• Arguments indicated by \$ are interpolated.
• Concealed newlines indicated by \(newline) are eliminated.
• Comments indicated by \" are eliminated.
• \t and \a are interpreted as ASCII horizontal tab and SOH respectively (§9).
• \\ is interpreted as \.
• \. is interpreted as ".".


These interpretations can be suppressed by prepending a \. For example, since \\ maps into a \, \\n 
will copy as \n which will be interpreted as a number register indicator when the macro or string is 
reread. 


7.3. Arguments. When a macro is invoked by name, the remainder of the line is taken to contain up to 
nine arguments. The argument separator is the space character, and arguments may be surrounded by 
double-quotes to permit imbedded space characters. Pairs of double-quotes may be imbedded in 
double-quoted arguments to represent a single double-quote. If the desired arguments won’t fit on a 
line, a concealed newline may be used to continue on the next line. 
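
The quoting rules for macro arguments can be stated as a small scanner. The Python sketch below is an illustration under assumptions: split_macro_args is a hypothetical name, only the space character is treated as a separator, concealed newlines are not handled, and anything beyond nine arguments is simply dropped.

```python
def split_macro_args(rest_of_line, max_args=9):
    """Split the remainder of a macro call line into at most nine arguments."""
    args, i, n = [], 0, len(rest_of_line)
    while i < n and len(args) < max_args:
        while i < n and rest_of_line[i] == " ":
            i += 1                              # skip separating spaces
        if i >= n:
            break
        if rest_of_line[i] == '"':
            i += 1                              # opening double-quote
            buf = []
            while i < n:
                if rest_of_line[i] == '"':
                    if i + 1 < n and rest_of_line[i + 1] == '"':
                        buf.append('"')         # a pair "" stands for one "
                        i += 2
                        continue
                    i += 1                      # closing double-quote
                    break
                buf.append(rest_of_line[i])
                i += 1
            args.append("".join(buf))
        else:
            start = i
            while i < n and rest_of_line[i] != " ":
                i += 1
            args.append(rest_of_line[start:i])
    return args


print(split_macro_args('Monday 14th'))
print(split_macro_args('"two words" three'))
```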


When a macro is invoked the input level is pushed down and any arguments available at the previous
level become unavailable until the macro is completely read and the previous level is restored. A
macro's own arguments can be interpolated at any point within the macro with \$N, which interpolates
the Nth argument (1≤N≤9). If an invoked argument doesn't exist, a null string results. For example,
the macro xx may be defined by


     .de xx      \"begin definition
     Today is \\$1 the \\$2.
     ..          \"end definition


and called by
     .xx Monday 14th
to produce the text
     Today is Monday the 14th.


Note that the \$ was concealed in the definition with a prepended \. The number of currently available
arguments is in the .$ register.
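
The two stages in this example, storing the definition in copy mode and interpolating arguments when the macro is read, can be traced with a short model. The Python sketch below is hypothetical: copy_mode_store and expand_macro are invented names, and only two rules are modeled, namely \\ becoming \ in copy mode and \$N being replaced by the Nth argument (or by nothing if it is missing).

```python
import re


def copy_mode_store(definition_lines):
    """Model one copy-mode rule: each \\\\ in the input is stored as a single \\."""
    return [line.replace("\\\\", "\\") for line in definition_lines]


def expand_macro(stored_lines, args):
    """Replace \\$1 .. \\$9 with the matching argument; missing ones become null."""
    def substitute(match):
        n = int(match.group(1))
        return args[n - 1] if n <= len(args) else ""
    return [re.sub(r"\\\$([1-9])", substitute, line) for line in stored_lines]


# Body of the macro xx as typed in the definition (note the doubled backslash).
definition = [r"Today is \\$1 the \\$2."]
stored = copy_mode_store(definition)            # what the macro actually contains
print(stored[0])
print(expand_macro(stored, ["Monday", "14th"])[0])
```

The stored body contains \$1 and \$2, which is why the references are resolved at call time rather than at definition time.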


No arguments are available at the top (non-macro) level in this implementation. Because string
referencing is implemented as an input-level push down, no arguments are available from within a string.
No arguments are available within a trap-invoked macro.


Arguments are copied in copy mode onto a stack where they are available for reference. The mechan- 
ism does not allow an argument to contain a direct reference to a long string (interpolated at copy time) 
and it is advisable to conceal string references (with an extra \) to delay interpolation until argument 
reference time. 


7.4. Diversions. Processed output may be diverted into a macro for purposes such as footnote processing
(see Tutorial §T5) or determining the horizontal and vertical size of some text for conditional changing
of pages or columns. A single diversion trap may be set at a specified vertical position. The number
registers dn and dl respectively contain the vertical and horizontal size of the most recently ended
diversion. Processed text that is diverted into a macro retains the vertical size of each of its lines when
reread in nofill mode regardless of the current V. Constant-spaced (cs) or emboldened (bd) text that is
diverted can be reread correctly only if these modes are again or still in effect at reread time. One way
to do this is to imbed in the diversion the appropriate cs or bd requests with the transparent mechanism
described in §10.6.


Diversions may be nested and certain parameters and registers are associated with the current diversion 
level (the top non-diversion level may be thought of as the 0th diversion level). These are the diversion
trap and associated macro, no-space mode, the internally-saved marked place (see mk and rt), the
current vertical place (.d register), the current high-water text base-line (.h register), and the current 
diversion name (.z register). 


7.5. Traps. Three types of trap mechanisms are available—page traps, a diversion trap, and an input- 
line-count trap. Macro-invocation traps may be planted using wh at any page position including the top. 
This trap position may be changed using ch. Trap positions at or below the bottom of the page have no 
effect unless or until moved to within the page or rendered effective by an increase in page length. 
Two traps may be planted at the same position only by first planting them at different positions and 
then moving one of the traps; the first planted trap will conceal the second unless and until the first one 
is moved (see Tutorial Examples §T5). If the first one is moved back, it again conceals the second 
trap. The macro associated with a page trap is automatically invoked when a line of text is output 
whose vertical size reaches or sweeps past the trap position. Reaching the bottom of a page springs the 
top-of-page trap, if any, provided there is a next page. The distance to the next trap position is avail- 
able in the .t register; if there are no traps between the current position and the bottom of the page, the 
distance returned is the distance to the page bottom. 


A macro-invocation trap effective in the current diversion may be planted using dt. The .t register 
works in a diversion; if there is no subsequent trap a large distance is returned. For a description of 
input-line-count traps, see it below. 
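
As a rough illustration of the .t register on a page, the following Python sketch computes the distance to the next page trap. The function name and the use of plain integers for positions in basic units are assumptions made for the example; it also assumes that a trap exactly at the current position is not counted as the "next" trap, which the text above does not state.

```python
def distance_to_next_trap(current: int, traps: list[int], page_length: int) -> int:
    """Model of the .t register on a page; all values are in basic units.

    Negative trap positions are measured from the page bottom. Traps at
    or below the bottom of the page have no effect. If no trap lies
    below the current position, the distance to the page bottom is
    returned.
    """
    effective = []
    for position in traps:
        absolute = page_length + position if position < 0 else position
        if absolute < page_length:
            effective.append(absolute)

    # Assumption: only traps strictly below the current position count.
    below = [p for p in effective if p > current]
    target = min(below) if below else page_length
    return target - current


# A header trap at the top and a footer trap 100 units above the bottom.
print(distance_to_next_trap(200, [0, -100], 1000))
```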


Request Initial If No 

Form Value Argument Notes Explanation 

.de xx yy - .yy=.. - Define or redefine the macro xx. The contents of the
macro begin on the next input line. Input lines are
copied in copy mode until the definition is terminated by a
line beginning with .yy, whereupon the macro yy is
called. In the absence of yy, the definition is terminated
by a line beginning with "..". A macro may contain de
requests provided the terminating macros differ or the
contained definition terminator is concealed. ".." can be
concealed as \\.. which will copy as \.. and be reread as
"..".

.am xx yy - .yy=.. - Append to macro (append version of de).

.ds xx string - ignored - Define a string xx containing string. Any initial
double-quote in string is stripped off to permit initial blanks.

.as xx string - ignored - Append string to string xx (append version of ds).

.rm xx - ignored - Remove request, macro, or string. The name xx is
removed from the name list and any related storage
space is freed. Subsequent references will have no effect.

.rn xx yy - ignored - Rename request, macro, or string xx to yy. If yy exists, it
is first removed.

.di xx - end D Divert output to macro xx. Normal text processing
occurs during diversion except that page offsetting is not
done. The diversion ends when the request di or da is
encountered without an argument; extraneous requests
of this type should not appear when nested diversions are
being used.

.da xx - end D Divert, appending to xx (append version of di).

.wh N xx - - v Install a trap to invoke xx at page position N; a negative N
will be interpreted with respect to the page bottom. Any
macro previously planted at N is replaced by xx. A zero
N refers to the top of a page. In the absence of xx, the
first found trap at N, if any, is removed.

.ch xx N - - v Change the trap position for macro xx to be N. In the
absence of N, the trap, if any, is removed.

.dt N xx - off D,v Install a diversion trap at position N in the current
diversion to invoke macro xx. Another dt will redefine the
diversion trap. If no arguments are given, the diversion
trap is removed.

.it N xx - off E Set an input-line-count trap to invoke the macro xx after
N lines of text input have been read (control or request
lines don’t count). The text may be in-line text or text
interpolated by inline or trap-invoked macros.

.em xx none none - The macro xx will be invoked when all input has ended.
The effect is the same as if the contents of xx had been
at the end of the last file processed.


8. Number Registers 


A variety of parameters are available to the user as predefined, named number registers (see Summary 
and Index, page 7). In addition, the user may define his own named registers. Register names are one 
or two characters long and do not conflict with request, macro, or string names. Except for certain 
predefined read-only registers, a number register can be read, written, automatically incremented or 
decremented, and interpolated into the input in a variety of formats. One common use of user-defined 
registers is to automatically number sections, paragraphs, lines, etc. A number register may be used 
any time numerical input is expected or desired and may be used in numerical expressions (§1.4).


Number registers are created and modified using nr, which specifies the name, numerical value, and 
the auto-increment size. Registers are also modified, if accessed with an auto-incrementing sequence. 
If the registers x and xx both contain N and have the auto-increment size M, the following access
sequences have the effect shown: 


           Effect on              Value
Sequence   Register               Interpolated

\nx        none                   N
\n(xx      none                   N
\n+x       x incremented by M     N+M
\n-x       x decremented by M     N-M
\n+(xx     xx incremented by M    N+M
\n-(xx     xx decremented by M    N-M
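
The effect of the auto-incrementing access sequences (\nx, \n+x, and \n-x for a one-character register name) can be sketched as a small Python class. The class and method names are hypothetical; the point is only that the register is modified first and the new value is then interpolated.

```python
class NumberRegister:
    """Minimal model of a number register with an auto-increment size."""

    def __init__(self, value: int, increment: int = 0):
        self.value = value          # N in the table above
        self.increment = increment  # M in the table above

    def read(self) -> int:
        """Plain interpolation: the register is left unchanged."""
        return self.value

    def read_incrementing(self) -> int:
        """Auto-increment access: add M, then interpolate the new value."""
        self.value += self.increment
        return self.value

    def read_decrementing(self) -> int:
        """Auto-decrement access: subtract M, then interpolate the new value."""
        self.value -= self.increment
        return self.value


section = NumberRegister(10, 2)
print(section.read(), section.read_incrementing(), section.read_decrementing())
```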


When interpolated, a number register is converted to decimal (default), decimal with leading zeros, 
lower-case Roman, upper-case Roman, lower-case sequential alphabetic, or upper-case sequential
alphabetic according to the format specified by af.


Request Initial If No
Form Value Argument Notes Explanation

.nr R ±N M - u The number register R is assigned the value ±N with
respect to the previous value, if any. The increment for
auto-incrementing is set to M.

.af R c arabic - - Assign format c to register R. The available formats are:

Format   Numbering Sequence

1        0,1,2,3,4,5,...
001      000,001,002,003,004,005,...
i        0,i,ii,iii,iv,v,...
I        0,I,II,III,IV,V,...
a        0,a,b,c,...,z,aa,ab,...,zz,aaa,...
A        0,A,B,C,...,Z,AA,AB,...,ZZ,AAA,...

An arabic format having N digits specifies a field width of
N digits (example 2 above). The read-only registers and
the width function (§11.2) are always arabic.

.rr R - ignored - Remove register R. If many registers are being created
dynamically, it may become necessary to remove no
longer used registers to recapture internal storage space
for newer registers.
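
The numbering sequences selected by af can be reproduced with the following Python sketch. It is an illustration under stated assumptions rather than the formatter's own algorithm: the function names are hypothetical, only non-negative values are handled, and the Roman conversion uses the classical numerals, so it is meaningful only up to 3999.

```python
def to_roman(n: int) -> str:
    """Lower-case classical Roman numeral for 1 <= n <= 3999."""
    pairs = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
             (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
             (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, numeral in pairs:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


def to_alpha(n: int) -> str:
    """Sequential alphabetic form: 1 -> a, 26 -> z, 27 -> aa, ..."""
    out = []
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        out.append(chr(ord("a") + remainder))
    return "".join(reversed(out))


def format_register(value: int, fmt: str) -> str:
    """Convert a non-negative register value according to an af format."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    if fmt.isdigit():
        # An arabic format of N digits gives a zero-filled field of width N.
        return str(value).zfill(len(fmt))
    if value == 0:
        return "0"  # every sequence in the table starts with 0
    if fmt == "i":
        return to_roman(value)
    if fmt == "I":
        return to_roman(value).upper()
    if fmt == "a":
        return to_alpha(value)
    if fmt == "A":
        return to_alpha(value).upper()
    raise ValueError(f"unknown format: {fmt!r}")


print([format_register(4, f) for f in ("1", "001", "i", "I", "a", "A")])
```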


9. Tabs, Leaders, and Fields 


9.1. Tabs and leaders. The ASCII horizontal tab character and the ASCII SOH (hereafter known as the
leader character) can both be used to generate either horizontal motion or a string of repeated characters.
The length of the generated entity is governed by internal tab stops specifiable with ta. The
default difference is that tabs generate motion and leaders generate a string of periods; tc and lc offer
the choice of repeated character or motion. There are three types of internal tab stops: left adjusting,
right adjusting, and centering. In the following table: D is the distance from the current position on the
input line (where a tab or leader was found) to the next tab stop; next-string consists of the input characters
following the tab (or leader) up to the next tab (or leader) or end of line; and W is the width of
next-string.


Tab        Length of motion or    Location of
type       repeated characters    next-string

Left       D                      Following D
Right      D-W                    Right adjusted within D
Centered   D-W/2                  Centered on right end of D













The length of generated motion is allowed to be negative, but that of a repeated character string cannot 
be. Repeated character strings contain an integer number of characters, and any residual distance is 
prepended as motion. Tabs or leaders found after the last tab stop are ignored, but may be used as 
next-string terminators. 
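
The three tab-stop types correspond to generated lengths of D, D-W, and D-W/2 for left-adjusting, right-adjusting, and centering stops respectively. The Python sketch below works these out in integer basic units; the function names are hypothetical, and the handling of a non-positive leader length (no characters and no motion) is an assumption, since the text says only that a repeated character string cannot have negative length.

```python
def tab_motion(stop_type: str, d: int, w: int) -> int:
    """Length of the motion or repeated-character run generated by a tab.

    d is the distance to the next tab stop and w the width of
    next-string, both in basic units. stop_type is "L", "R", or "C".
    """
    if stop_type == "L":
        return d            # next-string follows the full distance
    if stop_type == "R":
        return d - w        # next-string ends at the stop
    if stop_type == "C":
        return d - w // 2   # next-string is centered on the stop
    raise ValueError(f"unknown tab stop type: {stop_type!r}")


def leader_fill(length: int, char_width: int) -> tuple[int, int]:
    """Split a run into (leading motion, number of repeated characters).

    A whole number of characters is used; any residual distance is
    prepended as motion.
    """
    if length <= 0:
        return (0, 0)  # assumption: nothing is generated
    count = length // char_width
    return (length - count * char_width, count)


print(tab_motion("R", 100, 30), tab_motion("C", 100, 30), leader_fill(100, 24))
```

Note that tab_motion may return a negative value, which matches the statement that generated motion is allowed to be negative.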


Tabs and leaders are not interpreted in copy mode. \t and \a always generate a non-interpreted tab and 
leader respectively, and are equivalent to actual tabs and leaders in copy mode. 


9.2. Fields. A field is contained between a pair of field delimiter characters, and consists of sub-strings
separated by padding indicator characters. The field length is the distance on the input line from the 
position where the field begins to the next tab stop. The difference between the total length of all the 
sub-strings and the field length is incorporated as horizontal padding space that is divided among the 
indicated padding places. The incorporated padding is allowed to be negative. For example, if the field 
delimiter is # and the padding indicator is ^, #^xxx^right# specifies a right-adjusted string with the
string xxx centered in the remaining space.


Request Initial If No
Form Value Argument Notes Explanation

.ta Nt ... 0.8; 0.5in none E,m Set tab stops and types. t=R, right adjusting; t=C,
centering; t absent, left adjusting. TROFF tab stops are
preset every 0.5in.; NROFF every 0.8in. The stop values
are separated by spaces, and a value preceded by + is
treated as an increment to the previous stop value.

.tc c none none E The tab repetition character becomes c, or is removed
specifying motion.

.lc c . none E The leader repetition character becomes c, or is removed
specifying motion.

.fc a b off off - The field delimiter is set to a; the padding indicator is set
to the space character or to b, if given. In the absence of
arguments the field mechanism is turned off.


10. Input and Output Conventions and Character Translations 


10.1. Input character translations. Ways of inputting the graphic character set were discussed in §2.1. 
The ASCII control characters horizontal tab (§9.1), SOH (§9.1), and backspace (§10.3) are discussed
elsewhere. The newline delimits input lines. In addition, STX, ETX, ENQ, ACK, and BEL are accepted,
and may be used as delimiters or translated into a graphic with tr (§10.5). All others are ignored.


The escape character \ introduces escape sequences—causes the following character to mean another 
character, or to indicate some function. A complete list of such sequences is given in the Summary 
and Index on page 6. \ should not be confused with the ASCII control character ESC of the same name. 
The escape character \ can be input with the sequence \\. The escape character can be changed with 
ec, and all that has been said about the default \ becomes true for the new escape character. \e can be 
used to print whatever the current escape character is. If necessary or convenient, the escape mechan- 
ism may be turned off with eo, and restored with ec. 


Request Initial If No 

Form Value Argument Notes Explanation 

.ec c \ \ - Set escape character to \, or to c, if given.
.eo on - - Turn escape mechanism off.


10.2. Ligatures. Five ligatures are available in the current TROFF character set: fi, fl, ff, ffi, and ffl.
They may be input (even in NROFF) by \(fi, \(fl, \(ff, \(Fi, and \(Fl respectively. The ligature mode
is normally on in TROFF, and automatically invokes ligatures during input. 


Request Initial If No 
Form Value Argument Notes Explanation 
.lg N off; on on - Ligature mode is turned on if N is absent or non-zero,
and turned off if N=0. If N=2, only the two-character
ligatures are automatically invoked. Ligature mode is
inhibited for request, macro, string, register, or file
names, and in copy mode. No effect in NROFF.


10.3. Backspacing, underlining, overstriking, etc. Unless in copy mode, the ASCII backspace character is 
replaced by a backward horizontal motion having the width of the space character. Underlining as a 
form of line-drawing is discussed in §12.4. A generalized overstriking function is described in §12.1.


NROFF automatically underlines characters in the underline font, specifiable with uf, normally that on 
font position 2 (normally Times Italic, see §2.2). In addition to ft and \fF the underline font may be 
selected by ul and cu. Underlining is restricted to an output-device-dependent subset of reasonable 
characters. 


Request Initial If No
Form Value Argument Notes Explanation

.ul N off N=1 E Underline in NROFF (italicize in TROFF) the next N
input text lines. Actually, switch to underline font, saving
the current font for later restoration; other font changes
within the span of a ul will take effect, but the restoration
will undo the last change. Output generated by tl
(§14) is affected by the font change, but does not decrement
N. If N>1, there is the risk that a trap interpolated
macro may provide text lines within the span;
environment switching can prevent this.

.cu N off N=1 E A variant of ul that causes every character to be underlined
in NROFF. Identical to ul in TROFF.

.uf F Italic Italic - Underline font set to F. In NROFF, F may not be on
position 1 (initially Times Roman).


10.4. Control characters. Both the control character . and the no-break control character ' may be
changed, if desired. Such a change must be compatible with the design of any macros used in the span 
of the change, and particularly of any trap-invoked macros. 


Request Initial If No 

Form Value Argument Notes Explanation 

.cc c . . E The basic control character is set to c, or reset to ".".
.c2 c ' ' E The nobreak control character is set to c, or reset to "'".


10.5. Output translation. One character can be made a stand-in for another character using tr. All text 
processing (e. g. character comparisons) takes place with the input (stand-in) character which appears to 
have the width of the final character. The graphic translation occurs at the moment of output (includ- 
ing diversion). 


Request Initial If No
Form Value Argument Notes Explanation

.tr abcd.... none - O Translate a into b, c into d, etc. If an odd number of
characters is given, the last one will be mapped into the
space character. To be consistent, a particular translation
must stay in effect from input to output time.
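
The pairing rule of tr, including the odd-length case, can be sketched in Python as follows. The function names are hypothetical, and the sketch treats every argument character as a single plain character; it does not model escape sequences such as special character names, which the real request accepts.

```python
def build_translation(spec: str) -> dict[str, str]:
    """Pair up the characters of a tr argument: a->b, c->d, and so on.

    If an odd number of characters is given, the last one is mapped
    into the space character.
    """
    if len(spec) % 2:
        spec += " "
    return {spec[i]: spec[i + 1] for i in range(0, len(spec), 2)}


def translate_output(text: str, table: dict[str, str]) -> str:
    """Apply the translation at output time; unlisted characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


table = build_translation("abc")
print(repr(translate_output("a cab", table)))
```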


10.6. Transparent throughput. An input line beginning with a \! is read in copy mode and transparently 
output (without the initial \!); the text processor is otherwise unaware of the line’s presence. This 
mechanism may be used to pass control information to a post-processor or to imbed control lines in a 
macro created by a diversion. 


10.7. Comments and concealed newlines. An uncomfortably long input line that must stay one line (e. g. 
a string definition, or nofilled text) can be split into many physical lines by ending all but the last one 
with the escape \. The sequence \(newline) is always ignored, except in a comment. Comments may
be imbedded at the end of any line by prefacing them with \". The newline at the end of a comment
cannot be concealed. A line beginning with \" will appear as a blank line and behave like .sp 1; a comment
can be on a line by itself by beginning the line with .\".


11. Local Horizontal and Vertical Motions, and the Width Function 


11.1. Local Motions. The functions \v'N' and \h'N' can be used for local vertical and horizontal motion
respectively. The distance N may be negative; the positive directions are rightward and downward. A
local motion is one contained within a line. To avoid unexpected vertical dislocations, it is necessary
that the net vertical local motion within a word in filled text and otherwise within a line balance to zero.


The above and certain other escape sequences providing local motion are summarized in the following 
table. 


Vertical       Effect in                      Horizontal     Effect in
Local Motion   TROFF          NROFF           Local Motion   TROFF           NROFF

\v'N'          Move distance N                \h'N'          Move distance N
\u             1/2 em up      1/2 line up     \(space)       Unpaddable space-size space
\d             1/2 em down    1/2 line down   \0             Digit-size space
\r             1 em up        1 line up       \|             1/6 em space    ignored
                                              \^             1/12 em space   ignored


As an example, E² could be generated by the sequence E\s-2\v'-0.4m'2\v'0.4m'\s+2; it should be
noted in this example that the 0.4 em vertical motions are at the smaller size.








11.2. Width Function. The width function \w'string' generates the numerical width of string (in basic
units). Size and font changes may be safely imbedded in string, and will not affect the current environment.
For example, .ti -\w'1. 'u could be used to temporarily indent leftward a distance equal to the
size of the string "1. ".


The width function also sets three number registers. The registers st and sb are set respectively to the 
highest and lowest extent of string relative to the baseline; then, for example, the total height of the
string is \n(stu-\n(sbu. In TROFF the number register ct is set to a value between 0 and 3: 0 means
that all of the characters in string were short lower case characters without descenders (like e); 1 means
that at least one character has a descender (like y); 2 means that at least one character is tall (like H); 
and 3 means that both tall characters and characters with descenders are present. 


11.3. Mark horizontal place. The escape sequence \kx will cause the current horizontal position in the 
input line to be stored in register x. As an example, the construction \kxword\h'|\nxu+2u'word will
embolden word by backing up to almost its beginning and overprinting it, resulting in word. 


12. Overstrike, Bracket, Line-drawing, and Zero-width Functions 


12.1. Overstriking. Automatically centered overstriking of up to nine characters is provided by the
overstrike function \o'string'. The characters in string are overprinted with centers aligned; the total width is
that of the widest character. string should not contain local vertical motion. As examples, \o'e\'' produces
é, and \o'\(mo\(sl' produces ∉.


12.2. Zero-width characters. The function \zc will output c without spacing over it, and can be used to 
produce left-aligned overstruck combinations. As examples, \z\(ci\(pl will produce ⊕, and
\(br\z\(rn\(ul\(br will produce the smallest possible constructed box □.


12.3. Large Brackets. The Special Mathematical Font contains a number of bracket construction pieces
(⎧⎩⎫⎭⎨⎬│⎣⎦⎡⎤) that can be combined into various bracket styles. The function \b'string' may be used
to pile up vertically the characters in string (the first character on top and the last at the bottom); the
characters are vertically separated by 1 em and the total pile is centered 1/2 em above the current baseline
(1/2 line in NROFF). For example, \b'\(lc\(lf'E\|\b'\(rc\(rf'\x'-0.5m'\x'0.5m' produces [E].


12.4. Line drawing. The function \l'Nc' will draw a string of repeated c’s towards the right for a distance
N. (\l is \(lower case L).) If c looks like a continuation of an expression for N, it may be insulated
from N with a \&. If c is not specified, the _ (baseline rule) is used (underline character in NROFF). If
N is negative, a backward horizontal motion of size N is made before drawing the string. Any space
resulting from N/(size of c) having a remainder is put at the beginning (left end) of the string. In the
case of characters that are designed to be connected such as baseline-rule _, underrule _, and root-en ‾,
the remainder space is covered by over-lapping. If N is less than the width of c, a single c is centered
on a distance N. As an example, a macro to underscore a string can be written


.de us
\\$1\l'|0\(ul'
..


or one to draw a box around a string


.de bx
\(br\|\\$1\|\(br\l'|0\(rn'\l'|0\(ul'
..


such that

.us "underlined words"
and

.bx "words in a box"


yield underlined words (underscored) and words in a box (boxed).


The function \L'Nc' will draw a vertical line consisting of the (optional) character c stacked vertically
apart 1 em (1 line in NROFF), with the first two characters overlapped, if necessary, to form a continuous
line. The default character is the box rule | (\(br); the other suitable character is the bold vertical |
(\(bv). The line is begun without any initial motion relative to the current base line. A positive N
specifies a line drawn downward and a negative N specifies a line drawn upward. After the line is drawn 
no compensating motions are made; the instantaneous baseline is at the end of the line. 





The horizontal and vertical line drawing functions may be used in combination to produce large boxes. 
The zero-width box-rule and the 1/2-em wide underrule were designed to form corners when using 1-em
vertical spacings. For example the macro 









.de eb
.sp -1 \"compensate for next automatic base-line spacing
.nf \"avoid possibly overflowing word buffer
\h'-.5n'\L'|\\nau-1'\l'\\n(.lu+1n\(ul'\L'-|\\nau+1'\l'|0u-.5n\(ul' \"draw box
.fi
..





will draw a box around some text whose beginning vertical place was saved in number register a (e. g. 
using .mk a) as done for this paragraph. 


13. Hyphenation. 


The automatic hyphenation may be switched off and on. When switched on with hy, several variants 
may be set. A hyphenation indicator character may be imbedded in a word to specify desired hyphena- 
tion points, or may be prepended to suppress hyphenation. In addition, the user may specify a small 
exception word list. 


Only words that consist of a central alphabetic string surrounded by (usually null) non-alphabetic 
strings are considered candidates for automatic hyphenation. Words that were input containing hyphens 
(minus), em-dashes (\(em), or hyphenation indicator characters—such as mother-in-law—are always 
subject to splitting after those characters, whether or not automatic hyphenation is on or off. 
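
The argument N of the hy request is a sum of flag values: N≥1 turns hyphenation on, and the values 2, 4, and 8 each add one restriction, so that N=14 invokes all three. A minimal Python sketch of that decoding, with hypothetical function and key names, is:

```python
def decode_hy(n: int) -> dict[str, bool]:
    """Decode the additive argument of the hy request into its parts."""
    return {
        # Hyphenation is on for N >= 1 and off for N = 0.
        "enabled": n >= 1,
        # 2: last lines (ones that will cause a trap) are not hyphenated.
        "no_last_line": bool(n & 2),
        # 4: the last two characters of a word are not split off.
        "keep_last_two": bool(n & 4),
        # 8: the first two characters of a word are not split off.
        "keep_first_two": bool(n & 8),
    }


print(decode_hy(14))
```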


Request Initial If No 

Form Value Argument Notes Explanation 

.nh hyphenate - E Automatic hyphenation is turned off.

.hy N on, N=1 on, N=1 E Automatic hyphenation is turned on for N≥1, or off for
N=0. If N=2, last lines (ones that will cause a trap)
are not hyphenated. For N=4 and 8, the last and first
two characters respectively of a word are not split off.
These values are additive; i.e. N=14 will invoke all
three restrictions.

.hc c \% \% E Hyphenation indicator character is set to c or to the
default \%. The indicator does not appear in the output.

.hw word1 ... ignored - Specify hyphenation points in words with imbedded
minus signs. Versions of a word with terminal s are
implied; i.e. dig-it implies dig-its. This list is examined
initially and after each suffix stripping. The space
available is small, about 128 characters.


14. Three Part Titles.


The titling function tl provides for automatic placement of three fields at the left, center, and right of a 
line with a title-length specifiable with lt. tl may be used anywhere, and is independent of the normal
text collecting process. A common use is in header and footer macros. 


Request Initial If No
Form Value Argument Notes Explanation

.tl 'left'center'right' - - The strings left, center, and right are respectively
left-adjusted, centered, and right-adjusted in the current
title-length. Any of the strings may be empty, and overlapping
is permitted. If the page-number character (initially
%) is found within any of the fields it is replaced by
the current page number having the format assigned to
register %. Any character may be used as the string delimiter.

.pc c % off - The page number character is set to c, or removed. The
page-number register remains %.

.lt ±N 6.5 in previous E,m Length of title set to ±N. The line-length and the
title-length are independent. Indents do not apply to titles;
page-offsets do.


15. Output Line Numbering.


Automatic sequence numbering of output lines may be requested with nm. When in effect, a 
three-digit, arabic number plus a digit-space is prepended to output text lines. The text lines are 
3 thus offset by four digit-spaces, and otherwise retain their line length; a reduction in line length 
may be desired to keep the right margin aligned with an earlier margin. Blank lines, other vertical 
spaces, and lines generated by tl are not numbered. Numbering can be temporarily suspended with 
6 nn, or with an .nm followed by a later .nm +0. In addition, a line number indent I, and the
number-text separation S may be specified in digit-spaces. Further, it can be specified that only 
those line numbers that are multiples of some number M are to be printed (the others will appear 
9 as blank number fields). 
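

The numbering scheme just described can be approximated with the short Python sketch below. It is a simplified model, not the formatter's code: the function and parameter names are hypothetical, ordinary spaces stand in for digit-spaces, and only blank lines are treated as unnumbered (other vertical spaces and tl output are not modelled).

```python
def number_lines(lines: list[str], start: int = 1, multiple: int = 1,
                 separation: int = 1, indent: int = 0) -> list[str]:
    """Prepend a three-digit number field to each non-blank output line.

    start is the first line number, multiple is M (only multiples of M
    are printed), separation is S and indent is I, both in digit-spaces.
    """
    numbered = []
    n = start
    for line in lines:
        if line == "":
            numbered.append(line)  # blank lines are not numbered
            continue
        # Numbers that are not multiples of M appear as blank fields.
        field = f"{n:3d}" if n % multiple == 0 else " " * 3
        numbered.append(" " * indent + field + " " * separation + line)
        n += 1
    return numbered


for out in number_lines(["first", "second", "", "third"], multiple=3):
    print(repr(out))
```

With the default S=1 and I=0 each text line is offset by four character positions, which corresponds to the four digit-spaces mentioned above.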


Request Initial If No
Form Value Argument Notes Explanation

.nm ±N M S I - off E Line number mode. If ±N is given, line numbering is
turned on, and the next output line numbered is numbered
±N. Default values are M=1, S=1, and I=0.
Parameters corresponding to missing arguments are
unaffected; a non-numeric argument is considered missing.
In the absence of all arguments, numbering is
turned off; the next line number is preserved for possible
further use in number register ln.

.nn N - N=1 E The next N text output lines are not numbered.


As an example, the paragraph portions of this section are numbered with M=3: .nm 1 3 was
placed at the beginning; .nm was placed at the end of the first paragraph; and .nm +0 was placed
12 in front of this paragraph; and .nm finally placed at the end. Line lengths were also changed (by
\w'0000'u) to keep the right side aligned. Another example is .nm +5 5 x 3 which turns on
numbering with the line number of the next line to be 5 greater than the last numbered line, with
15 M=5, with spacing S untouched, and with the indent I set to 3.


16. Conditional Acceptance of Input


In the following, c is a one-character, built-in condition name, ! signifies not, N is a numerical expression,
string1 and string2 are strings delimited by any non-blank, non-numeric character not in the
strings, and anything represents what is conditionally accepted.


Request Initial If No 

Form Value Argument Notes Explanation 

.if c anything - - If condition c true, accept anything as input; in multi-line
case use \{anything\}.

.if !c anything - - If condition c false, accept anything.

.if N anything - u If expression N > 0, accept anything.

.if !N anything - u If expression N ≤ 0, accept anything.

.if 'string1'string2' anything - If string1 identical to string2, accept anything.

.if !'string1'string2' anything - If string1 not identical to string2, accept anything.

.ie c anything - u If portion of if-else; all above forms (like if).

.el anything - - Else portion of if-else.


The built-in condition names are: 


Condition
Name   True If

o      Current page number is odd
e      Current page number is even
t      Formatter is TROFF
n      Formatter is NROFF






If the condition c is true, or if the number N is greater than zero, or if the strings compare identically
(including motions and character size and font), anything is accepted as input. If a ! precedes the condi- 
tion, number, or string comparison, the sense of the acceptance is reversed. 


Any spaces between the condition and the beginning of anything are skipped over. The anything can be 
either a single input line (text, macro, or whatever) or a number of input lines. In the multi-line case, 
the first line must begin with a left delimiter \{ and the last line must end with a right delimiter \}. 


The request ie (if-else) is identical to if except that the acceptance state is remembered. A subsequent 
and matching el (else) request then uses the reverse sense of that state. ie - el pairs may be nested. 


Some examples are: 
.if e .tl 'Even Page %'''
which outputs a title if the page number is even; and
.ie \n%>1 \{\
'sp 0.5i
.tl 'Page %'''
'sp |1.2i \}
.el .sp |2.5i
which treats page 1 differently from other pages.


17. Environment Switching. 


A number of the parameters that control the text processing are gathered together into an environment, 
which can be switched by the user. The environment parameters are those associated with requests 
noting E in their Notes column; in addition, partially collected lines and words are in the environment. 
Everything else is global; examples are page-oriented parameters, diversion-oriented parameters, 
number registers, and macro and string definitions. All environments are initialized with default
parameter values. 


Request Initial If No 
Form Value Argument Notes Explanation 
.ev N N=0 previous - Environment switched to environment 0≤N≤2. Switching
is done in push-down fashion so that restoring a previous
environment must be done with .ev rather than
specific reference.


18. Insertions from the Standard Input 


The input can be temporarily switched to the system standard input with rd, which will switch back 
when two newlines in a row are found (the extra blank line is not used). This mechanism is intended 
for insertions in form-letter-like documentation. On UNIX, the standard input can be the user’s key- 
board, a pipe, or a file. 


Request Initial If No 
Form Value Argument Notes Explanation 
.rd prompt - prompt=BEL - Read insertion from the standard input until two newlines
in a row are found. If the standard input is the
user’s keyboard, prompt (or a BEL) is written onto the
user’s terminal. rd behaves like a macro, and arguments
may be placed after prompt.

.ex - - - Exit from NROFF/TROFF. Text processing is terminated
exactly as if all input had ended.


If insertions are to be taken from the terminal keyboard while output is being printed on the terminal, 
the command line option -q will turn off the echoing of keyboard input and prompt only with BEL.
The regular input and insertion input cannot simultaneously come from the standard input. 


As an example, multiple copies of a form letter may be prepared by entering the insertions for all the 
copies in one file to be used as the standard input, and causing the file containing the letter to reinvoke 
itself using nx (§19); the process would ultimately be ended by an ex in the insertion file. 


19. Input/Output File Switching 


Request Initial If No 

Form Value Argument Notes Explanation 

.so filename - - Switch source file. The top input (file reading) level is
switched to filename. The effect of an so encountered in
a macro is not felt until the input level returns to the file
level. When the new file ends, input is again taken from
the original file. so’s may be nested.

.nx filename end-of-file - Next file is filename. The current file is considered
ended, and the input is immediately switched to filename.

.pi program - - Pipe output to program (NROFF only). This request
must occur before any printing occurs. No arguments are
transmitted to program.


20. Miscellaneous 


Request Initial If No
Form Value Argument Notes Explanation

.mc c N - off E,m Specifies that a margin character c appear a distance N to
the right of the right margin after each non-empty text
line (except those produced by tl). If the output line is
too-long (as can happen in nofill mode) the character will
be appended to the line. If N is not given, the previous
N is used; the initial N is 0.2 inches in NROFF and 1 em
in TROFF. The margin character used with this paragraph
was a 12-point box-rule.

.tm string - newline - After skipping initial blanks, string (rest of the line) is
read in copy mode and written on the user’s terminal.

.ig yy - .yy=.. - Ignore input lines. ig behaves exactly like de (§7) except
that the input is discarded. The input is read in copy
mode, and any auto-incremented registers will be
affected.

.pm t - all - Print macros. The names and sizes of all of the defined
macros and strings are printed on the user’s terminal; if t
is given, only the total of the sizes is printed. The sizes
are given in blocks of 128 characters.

.fl - - B Flush output buffer. Used in interactive debugging to
force output.


21. Output and Error Messages. 


The output from tm, pm, and the prompt from rd, as well as various error messages, are written onto
UNIX’s standard message output. The latter is different from the standard output, where NROFF
formatted output goes. By default, both are written onto the user’s terminal, but they can be
independently redirected.


Various error conditions may occur during the operation of NROFF and TROFF. Certain less serious
errors having only local impact do not cause processing to terminate. Two examples are word overflow,
caused by a word that is too large to fit into the word buffer (in fill mode), and line overflow, caused by
an output line that grew too large to fit in the line buffer; in both cases, a message is printed, the
offending excess is discarded, and the affected word or line is marked at the point of truncation with a *
in NROFF and a left-hand symbol (\(lh) in TROFF. The philosophy is to continue processing, if possible,
on the grounds that output useful for debugging may be produced. If a serious error occurs, processing
terminates, and an appropriate message is printed. Examples are the inability to create, read, or write
files, and the exceeding of certain internal limits that make future output unlikely to be useful.
TUTORIAL EXAMPLES


T1. Introduction


Although NROFF and TROFF have by design a 
syntax reminiscent of earlier text processors*
with the intent of easing their use, it is almost 
always necessary to prepare at least a small set of 
macro definitions to describe most documents. 
Such common formatting needs as page margins 
and footnotes are deliberately not built into 
NROFF and TROFF. Instead, the macro and 
string definition, number register, diversion, 
environment switching, page-position trap, and 
conditional input mechanisms provide the basis 
for user-defined implementations. 


The examples to be discussed are intended to be 
useful and somewhat realistic, but won’t neces- 
sarily cover all relevant contingencies. Explicit 
numerical parameters are used in the examples to 
make them easier to read and to illustrate typical 
values. In many cases, number registers would 
really be used to reduce the number of places 
where numerical information is kept, and to concentrate
conditional parameter initialization like
that which depends on whether TROFF or NROFF 
is being used. 


T2. Page Margins 


As discussed in §3, header and footer macros are 
usually defined to describe the top and bottom 
page margin areas respectively. A trap is planted 
at page position 0 for the header, and at -N (N 
from the page bottom) for the footer. The sim- 
plest such definitions might be 


.de hd          \"define header
'sp 1i
..              \"end definition
.de fo          \"define footer
'bp
..              \"end definition
.wh 0 hd
.wh -1i fo


which provide blank 1 inch top and bottom margins.
The header will occur on the first page,
only if the definition and trap exist prior to the
initial pseudo-page transition (§3). In fill mode,
the output line that springs the footer trap was
typically forced out because some part or whole
word didn’t fit on it. If anything in the footer
and header that follows causes a break, that word
or part word will be forced out. In this and other
examples, requests like bp and sp that normally
cause breaks are invoked using the no-break control
character ' to avoid this. When the
header/footer design contains material requiring
independent text processing, the environment
may be switched, avoiding most interaction with
the running text.


*For example: P. A. Crisman, Ed., The Compatible Time-Sharing
System, MIT Press, 1965, Section AH9.01 (Description
of RUNOFF program on MIT’s CTSS system).


A more realistic example would be 


.de hd                  \"header
.if t .tl '\(rn''\(rn'  \"troff cut mark
.if \\n%>1 \{\
'sp |0.5i-1             \"tl base at 0.5i
.tl ''- % -''           \"centered page number
.ps                     \"restore size
.ft                     \"restore font
.vs \}                  \"restore vs
'sp |1.0i               \"space to 1.0i
.ns                     \"turn on no-space mode
..
.de fo                  \"footer
.ps 10                  \"set footer/header size
.ft R                   \"set font
.vs 12p                 \"set base-line spacing
.if \\n%=1 \{\
'sp |\\n(.pu-0.5i-1     \"tl base 0.5i up
.tl ''- % -'' \}        \"first page number
'bp
..
.wh 0 hd
.wh -1i fo

which sets the size, font, and base-line spacing 
for the header/footer material, and ultimately 
restores them. The material in this case is a page 
number at the bottom of the first page and at the 
top of the remaining pages. If TROFF is used, a 
cut mark is drawn in the form of root-en’s at each 
margin. The sp’s refer to absolute positions to 
avoid dependence on the base-line spacing. 
Another reason for this in the footer is that the 
footer is invoked by printing a line whose vertical 
spacing swept past the trap position by possibly as 
much as the base-line spacing. The no-space
mode is turned on at the end of hd to render 
ineffective accidental occurrences of sp at the top 
of the running text. 


The above method of restoring size, font, etc. 
presupposes that such requests (that set previous 
value) are not used in the running text. A better
scheme is to save and restore both the current and
previous values as shown for size in the follow- 
ing: 


.de fo
.nr s1 \\n(.s           \"current size
.ps
.nr s2 \\n(.s           \"previous size
.  ---                  \"rest of footer
..
.de hd
.  ---                  \"header stuff
.ps \\n(s2              \"restore previous size
.ps \\n(s1              \"restore current size
..


Page numbers may be printed in the bottom mar- 
gin by a separate macro triggered during the 
footer’s page ejection:


.de bn                  \"bottom number
.tl ''- % -''           \"centered page number
..
.wh -0.5i-1v bn         \"tl base 0.5i up


T3. Paragraphs and Headings 


The housekeeping associated with starting a new 
paragraph should be collected in a paragraph 
macro that, for example, does the desired 
preparagraph spacing, forces the correct font, 
size, base-line spacing, and indent, checks that 
enough space remains for more than one line, and 
requests a temporary indent. 


.de pg                  \"paragraph
.br                     \"break
.ft R                   \"force font,
.ps 10                  \"size,
.vs 12p                 \"spacing,
.in 0                   \"and indent
.sp 0.4                 \"prespace
.ne 1+\\n(.Vu           \"want more than 1 line
.ti 0.2i                \"temp indent
..


The first break in pg will force out any previous 
partial lines, and must occur before the vs. The 
forcing of font, etc. is partly a defense against 
prior error and partly to permit things like sec- 
tion heading macros to set parameters only once. 


The prespacing parameter is suitable for TROFF;
a larger space, at least as big as the output device 
vertical resolution, would be more suitable in 
NROFF. The choice of remaining space to test 
for in the ne is the smallest amount greater than 
one line (the .V is the available vertical resolu- 
tion). 


A macro to automatically number section head- 
ings might look like: 


.de sc                  \"section
.  ---                  \"force font, etc.
.sp 0.4                 \"prespace
.ne 2.4+\\n(.Vu         \"want 2.4+ lines
.fi
\\n+S.
..
.nr S 0 1               \"init S


The usage is .sc, followed by the section heading 
text, followed by .pg. The ne test value includes 
one line of heading, 0.4 line in the following pg, 
and one line of the paragraph text. A word consisting
of the next section number and a period is
produced to begin the heading line. The format 
of the number may be set by af (§8). 


Another common form is the labeled, indented 
paragraph, where the label protrudes left into the 
indent space. 


.de lp                  \"labeled paragraph
.pg
.in 0.5i                \"paragraph indent
.ta 0.2i 0.5i           \"label, paragraph
.ti 0
\t\\$1\t\c              \"flow into paragraph
..


The intended usage is ".lp label"; label will begin
at 0.2inch, and cannot exceed a length of
0.3inch without intruding into the paragraph.
The label could be right adjusted against 0.4inch
by setting the tabs instead with .ta 0.4iR 0.5i.
The last line of lp ends with \c so that it will 
become a part of the first line of the text that fol- 
lows. 


T4. Multiple Column Output 


The production of multiple column pages 
requires the footer macro to decide whether it 
was invoked by other than the last column, so 
that it will begin a new column rather than pro- 
duce the bottom margin. The header can initial- 
ize a column register that the footer will incre- 
ment and test. The following is arranged for two 
columns, but is easily modified for more. 


.de hd                  \"header
.nr cl 0 1              \"init column count
.mk                     \"mark top of text
..
.de fo                  \"footer
.ie \\n+(cl<2 \{\
.po +3.4i               \"next column; 3.1+0.3
.rt                     \"back to mark
.ns \}                  \"no-space mode
.el \{\
.po \\nMu               \"restore left margin
'bp \}
..
.ll 3.1i                \"column width
.nr M \\n(.o            \"save left margin


Typically a portion of the top of the first page
contains full width text; the request for the narrower
line length, as well as another .mk would
be made where the two column output was to
begin.
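
The column-counting logic of the footer can be sketched outside of troff. The following Python fragment is only an illustration of the increment-then-test behavior of the conditional in fo; the function names, the dictionary used for state, and the returned strings are assumptions made for the sketch and are not part of NROFF or TROFF.

```python
# Illustrative model of the header/footer column logic (not troff code).

def header(state):
    # Role of ".nr cl 0 1": start each page with a column count of zero.
    state["cl"] = 0

def footer(state, columns=2):
    # The register is auto-incremented first and only then compared,
    # which is what the "+" in the footer's conditional does.
    state["cl"] += 1
    if state["cl"] < columns:
        return "next column"   # shift the page offset and return to the mark
    return "new page"          # restore the left margin and eject the page

state = {}
header(state)
actions = [footer(state), footer(state)]
print(actions)
```

With columns set to 3, the footer would start a new column twice before ejecting the page, which is the sense in which the arrangement is easily modified for more columns.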



T5. Footnote Processing


The footnote mechanism to be described is used 
by imbedding the footnotes in the input text at 
the point of reference, demarcated by an initial 
.fn and a terminal .ef:


.fn
Footnote text and control lines...
.ef


In the following, footnotes are processed in a 
separate environment and diverted for later 
printing in the space immediately prior to the 
bottom margin. There is provision for the case 
where the last collected footnote doesn’t com- 
pletely fit in the available space. 


.de hd                  \"header
.nr x 0 1               \"init footnote count
.nr y 0-\\nb            \"current footer place
.ch fo -\\nbu           \"reset footer trap
.if \\n(dn .fz          \"leftover footnote
..
.de fo                  \"footer
.nr dn 0                \"zero last diversion size
.if \\nx \{\
.ev 1                   \"expand footnotes in ev1
.nf                     \"retain vertical size
.FN                     \"footnotes
.rm FN                  \"delete it
.if "\\n(.z"fy" .di     \"end overflow diversion
.nr x 0                 \"disable fx
.ev \}                  \"pop environment
'bp
..
.de fx                  \"process footnote overflow
.if \\nx .di fy         \"divert overflow
..
.de fn                  \"start footnote
.da FN                  \"divert (append) footnote
.ev 1                   \"in environment 1
.if \\n+x=1 .fs         \"if first, include separator
.fi                     \"fill mode
..
.de ef                  \"end footnote
.br                     \"finish output
.nr z \\n(.v            \"save spacing
.ev                     \"pop ev
.di                     \"end diversion
.nr y -\\n(dn           \"new footer position,
.if \\nx=1 .nr y -(\\n(.v-\\nz) \
                        \"uncertainty correction
.ch fo \\nyu            \"y is negative
.if (\\n(nl+1v)>(\\n(.p+\\ny) \
.ch fo \\n(nlu+1v       \"it didn’t fit
..
.de fs                  \"separator
\l'1i'                  \"1 inch rule
.br
..
.de fz                  \"get leftover footnote
.fn
.nf                     \"retain vertical size
.fy                     \"where fx put it
.ef
..
.nr b 1.0i              \"bottom margin size
.wh 0 hd                \"header trap
.wh 12i fo              \"footer trap, temp position
.wh -\\nbu fx           \"fx at footer position
.ch fo -\\nbu           \"conceal fx with fo


The header hd initializes a footnote count regis- 
ter x, and sets both the current footer trap posi- 
tion register y and the footer trap itself to a nom- 
inal position specified in register b. In addition, 
if the register dn indicates a leftover footnote, fz 
is invoked to reprocess it. The footnote start 
macro fn begins a diversion (append) in environment
1, and increments the count x; if the count
is one, the footnote separator fs is interpolated.
The separator is kept in a separate macro to per- 
mit user redefinition. The footnote end macro ef 
restores the previous environment and ends the 
diversion after saving the spacing size in register 
z. y is then decremented by the size of the 
footnote, available in dn; then on the first footnote,
y is further decremented by the difference
in vertical base-line spacings of the two environments,
to prevent the late triggering of the footer
trap from causing the last line of the combined 
footnotes to overflow. The footer trap is then set 
to the lower (on the page) of y or the current 
page position (nl) plus one line, to allow for 
printing the reference line. If indicated by x, the 
footer fo rereads the footnotes from FN in nofill 
mode in environment 1, and deletes FN. If the 
footnotes were too large to fit, the macro fx will 
be trap-invoked to redivert the overflow into fy, 
and the register dn will later indicate to the 
header whether fy is empty. Both fo and fx are 
planted in the nominal footer trap position in an 
order that causes fx to be concealed unless the fo 
trap is moved. The footer then terminates the 
overflow diversion, if necessary, and zeros x to 
disable fx, because the uncertainty correction 
together with a not-too-late triggering of the 
footer can result in the footnote rereading finish- 
ing before reaching the fx trap. 
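
The arithmetic that ef performs on the footer trap position can be restated as a small sketch. The Python below is an illustration only: it assumes all quantities are already in basic units, and the function and parameter names are invented for the example rather than taken from NROFF or TROFF.

```python
# Illustrative restatement of the footer-position arithmetic in ef (not troff code).

def end_footnote(y, x, dn, nl, page_length, v_text, v_foot):
    """Return the updated register y and the trap position measured from the page top.

    y           current footer place (negative: distance up from the page bottom)
    x           footnote count after the fn macro incremented it
    dn          vertical size of the footnote just diverted
    nl          current vertical position on the page
    page_length page length
    v_text      base-line spacing of the running text
    v_foot      base-line spacing saved in the footnote environment (register z)
    """
    y -= dn                          # make room for this footnote
    if x == 1:
        y -= (v_text - v_foot)       # uncertainty correction, first footnote only
    trap = page_length + y           # y is negative, so this is measured from the top
    if nl + v_text > trap:           # it didn't fit: allow one more text line
        trap = nl + v_text
    return y, trap

# Example: 11 inch page and 1 inch bottom margin at 432 units per inch,
# 12-point text spacing (72 units), one 10-point footnote line (60 units).
y, trap = end_footnote(-432, 1, 60, 2000, 4752, 72, 60)
print(y, trap)
```

When the reference line is already near the bottom of the page (for example nl = 4200 in the same setting), the trap is instead placed one text line below the current position, which corresponds to the "it didn't fit" branch of ef.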


A good exercise for the student is to combine 
the multiple-column and footnote mechanisms. 


T6. The Last Page 


After the last input file has ended, NROFF and 
TROFF invoke the end macro (§7), if any, and 
when it finishes, eject the remainder of the page. 
During the eject, any traps encountered are pro- 
cessed normally. At the end of this last page, 
processing terminates unless a partial line, word, 
or partial word remains. If it is desired that 
another page be started, the end-macro 


.de en                  \"end-macro
\c
'bp
..
.em en


will deposit a null partial word, and effect 
another last page. 
Table I


Font Style Examples


The following fonts are printed in 12-point, with a vertical spacing of 14-point, and with
non-alphanumeric characters separated by ¼ em space. The Special Mathematical Font was specially
prepared for Bell Laboratories by Graphic Systems, Inc. of Hudson, New Hampshire. The Times
Roman, Italic, and Bold are among the many standard fonts available from that company.


Times Roman


abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
! $ % & ( ) ‘ ’ * + − . , / : ; = ? [ ] |
• □ — ‐ _ ¼ ½ ¾ fi fl ff ffi ffl ° † ′ ¢ ® ©


Times Italic


abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
! $ % & ( ) ‘ ’ * + − . , / : ; = ? [ ] |
• □ — ‐ _ ¼ ½ ¾ fi fl ff ffi ffl ° † ′ ¢ ® ©


Times Bold


abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
! $ % & ( ) ‘ ’ * + − . , / : ; = ? [ ] |
• □ — ‐ _ ¼ ½ ¾ fi fl ff ffi ffl ° † ′ ¢ ® ©


Special Mathematical Font


(Specimen of the special font characters; the characters are those named in Table II.)
Table II


Input Naming Conventions for ’, ‘, and −
and for Non-ASCII Special Characters


Non-ASCII characters and minus on the standard fonts.


        Input   Character
Char    Name    Name

’       '       close quote
‘       `       open quote
—       \(em    3/4 Em dash
-       -       hyphen or
‐       \(hy    hyphen
−       \-      current font minus
•       \(bu    bullet
□       \(sq    square
_       \(ru    rule
¼       \(14    1/4
½       \(12    1/2
¾       \(34    3/4
fi      \(fi    fi
fl      \(fl    fl
ff      \(ff    ff
ffi     \(Fi    ffi
ffl     \(Fl    ffl
°       \(de    degree
†       \(dg    dagger
′       \(fm    foot mark
¢       \(ct    cent sign
®       \(rg    registered
©       \(co    copyright


Non-ASCII characters and ', `, _, +, −, =, and ∗ on the special font.


The ASCII characters @, #, ", ', `, <, >, \, {, }, ~, ^, and _ exist only on the special font and are
printed as a 1-em space if that font is not mounted. The following characters exist only on the special
font except for the upper case Greek letter names followed by † which are mapped into upper case
English letters in whatever font is mounted on font position one (default Times Roman). The special
math plus, minus, and equals are provided to insulate the appearance of equations from the choice of
standard fonts.


        Input   Character
Char    Name    Name

+       \(pl    math plus
−       \(mi    math minus
=       \(eq    math equals
∗       \(**    math star
§       \(sc    section
´       \(aa    acute accent
`       \(ga    grave accent
_       \(ul    underrule
/       \(sl    slash (matching backslash)
α       \(*a    alpha
β       \(*b    beta
γ       \(*g    gamma
δ       \(*d    delta
ε       \(*e    epsilon
ζ       \(*z    zeta
η       \(*y    eta
θ       \(*h    theta
ι       \(*i    iota
κ       \(*k    kappa
λ       \(*l    lambda
μ       \(*m    mu
ν       \(*n    nu
ξ       \(*c    xi
ο       \(*o    omicron
π       \(*p    pi
ρ       \(*r    rho
σ       \(*s    sigma
ς       \(ts    terminal sigma
τ       \(*t    tau
υ       \(*u    upsilon
φ       \(*f    phi
χ       \(*x    chi
ψ       \(*q    psi
ω       \(*w    omega
Α       \(*A    Alpha†
Β       \(*B    Beta†
Γ       \(*G    Gamma
Δ       \(*D    Delta
Ε       \(*E    Epsilon†
Ζ       \(*Z    Zeta†
Η       \(*Y    Eta†
Θ       \(*H    Theta
Ι       \(*I    Iota†
Κ       \(*K    Kappa†
Λ       \(*L    Lambda
Μ       \(*M    Mu†
Ν       \(*N    Nu†
Ξ       \(*C    Xi
Ο       \(*O    Omicron†
Π       \(*P    Pi
Ρ       \(*R    Rho†
Σ       \(*S    Sigma
Τ       \(*T    Tau†
Υ       \(*U    Upsilon
Φ       \(*F    Phi
Χ       \(*X    Chi†
Ψ       \(*Q    Psi
Ω       \(*W    Omega
√       \(sr    square root
¯       \(rn    root en extender
≥       \(>=    >=
≤       \(<=    <=
≡       \(==    identically equal
≃       \(~=    approx =
∼       \(ap    approximates
≠       \(!=    not equal
→       \(->    right arrow
←       \(<-    left arrow
↑       \(ua    up arrow
↓       \(da    down arrow
×       \(mu    multiply
÷       \(di    divide
±       \(+-    plus-minus
∪       \(cu    cup (union)
∩       \(ca    cap (intersection)
⊂       \(sb    subset of
⊃       \(sp    superset of
⊆       \(ib    improper subset
⊇       \(ip    improper superset
∞       \(if    infinity
∂       \(pd    partial derivative
∇       \(gr    gradient
¬       \(no    not
∫       \(is    integral sign
∝       \(pt    proportional to
∅       \(es    empty set
∈       \(mo    member of
|       \(br    box vertical rule
‡       \(dd    double dagger
☞       \(rh    right hand
☜       \(lh    left hand
        \(bs    Bell System logo
|       \(or    or
○       \(ci    circle
⎧       \(lt    left top of big curly bracket
⎩       \(lb    left bottom
⎫       \(rt    right top
⎭       \(rb    right bot
⎨       \(lk    left center of big curly bracket
⎬       \(rk    right center of big curly bracket
⎪       \(bv    bold vertical
⎣       \(lf    left floor (left bottom of big square bracket)
⎦       \(rf    right floor (right bottom)
⎡       \(lc    left ceiling (left top)
⎤       \(rc    right ceiling (right top)
Summary of Changes to N/TROFF Since October 1976 Manual


Options


-h      (Nroff only) Output tabs used during horizontal spacing to speed output as well as
        reduce output byte count. Device tab settings assumed to be every 8 nominal character
        widths. The default settings of input (logical) tabs is also initialized to every 8 nominal
        character widths.

-z      Efficiently suppresses formatted output. Only message output will occur (from "tm"s
        and diagnostics).


Old Requests


.ad c   The adjustment type indicator "c" may now also be a number previously obtained from
        the ".j" register (see below).

.so name
        The contents of file "name" will be interpolated at the point the "so" is encountered.
        Previously, the interpolation was done upon return to the file-reading input level.


New Request


.ab text
        Prints "text" on the message output and terminates without further processing. If "text"
        is missing, "User Abort." is printed. Does not cause a break. The output buffer is
        flushed.

.fz F N
        Forces font "F" to be in size N. N may have the form N, +N, or -N. For example,
                .fz 3 -2
        will cause an implicit \s-2 every time font 3 is entered, and a corresponding \s+2 when
        it is left. Special font characters occurring during the reign of font F will have the same
        size modification. If special characters are to be treated differently,
                .fz S F N
        may be used to specify the size treatment of special characters during font F. For
        example,
                .fz 3 -3
                .fz S 3 -0
        will cause automatic reduction of font 3 by 3 points while the special characters would
        not be affected. Any ".fp" request specifying a font on some position must precede
        ".fz" requests relating to that position.


New Predefined Number Registers.


.k      Read-only. Contains the horizontal size of the text portion (without indent) of the
        current partially collected output line, if any, in the current environment.

.j      Read-only. A number representing the current adjustment mode and type. Can be
        saved and later given to the "ad" request to restore a previous mode.

.P      Read-only. 1 if the current page is being printed, and zero otherwise.

.L      Read-only. Contains the current line-spacing parameter ("ls").

c.      General register access to the input line-number in the current input file. Contains the
        same value as the read-only ".c" register.
A TROFF Tutorial


Brian W. Kernighan 


Bell Laboratories 
Murray Hill, New Jersey 07974 


1. Introduction 


troff [1] is a text-formatting program, written
by J. F. Ossanna, for producing high-quality
printed output from the phototypesetter on the
UNIX and GCOS operating systems. This document
is an example of troff output.


The single most important rule of using 
troff is not to use it directly, but through some 
intermediary. In many ways, troff resembles an 
assembly language — a remarkably powerful and 
flexible one — but nonetheless such that many 
operations must be specified at a level of detail 
and in a form that is too hard for most people to 
use effectively. 


For two special applications, there are programs
that provide an interface to troff for the
majority of users. eqn [2] provides an easy to
learn language for typesetting mathematics; the
eqn user need know no troff whatsoever to
typeset mathematics. tbl [3] provides the same
convenience for producing tables of arbitrary
complexity.


For producing straight text (which may
well contain mathematics or tables), there are a
number of ‘macro packages’ that define formatting
rules and operations for specific styles of
documents, and reduce the amount of direct
contact with troff. In particular, the ‘-ms’ [4]
and PWB/MM [5] packages for Bell Labs internal
memoranda and external papers provide most
of the facilities needed for a wide range of document
preparation. (This memo was prepared
with ‘-ms’.) There are also packages for viewgraphs,
for simulating the older roff formatters
on UNIX and GCOS, and for other special applications.
Typically you will find these packages
easier to use than troff once you get beyond the
most trivial operations; you should always consider
them first.


In the few cases where existing packages
don’t do the whole job, the solution is not to
write an entirely new set of troff instructions
from scratch, but to make small changes to adapt
packages that already exist.


In accordance with this philosophy of letting
someone else do the work, the part of troff
described here is only a small part of the whole,
although it tries to concentrate on the more useful
parts. In any case, there is no attempt to be
complete. Rather, the emphasis is on showing
how to do simple things, and how to make incremental
changes to what already exists. The contents
of the remaining sections are:


 2. Point sizes and line spacing
 3. Fonts and special characters
 4. Indents and line length
 5. Tabs
 6. Local motions: Drawing lines and characters
 7. Strings
 8. Introduction to macros
 9. Titles, pages and numbering
10. Number registers and arithmetic
11. Macros with arguments
12. Conditionals
13. Environments
14. Diversions
Appendix: Typesetter character set

The troff described here is the C-language ver- 
sion running on UNIX at Murray Hill, as docu- 
mented in [1]. 


To use troff you have to prepare not only
the actual text you want printed, but some information
that tells how you want it printed.
(Readers who use roff will find the approach
familiar.) For troff the text and the formatting
information are often intertwined quite intimately.
Most commands to troff are placed on a
line separate from the text itself, beginning with
a period (one command per line). For example,


Some text.
.ps 14
Some more text.


will change the ‘point size’, that is, the size of
the letters being printed, to ‘14 point’ (one point
is 1/72 inch) like this:

Some text. Some more text.


Occasionally, though, something special
occurs in the middle of a line; to produce


Area = πr²


you have to type

Area = \(*p\fIr\fR\|\s8\u2\d\s0


(which we will explain shortly). The backslash 
character \ is used to introduce troff commands 
and special characters within a line of text. 


2. Point Sizes; Line Spacing 


As mentioned above, the command .ps 
sets the point size. One point is 1/72 inch, so 
6-point characters are at most 1/12 inch high, 
and 36-point characters are ½ inch. There are 15
point sizes, listed below. 

6 point: Pack my box with five dozen liquor jugs.

7 point: Pack my box with five dozen liquor jugs.

8 point: Pack my box with five dozen liquor jugs. 

9 point: Pack my box with five dozen liquor jugs. 
10 point: Pack my box with five dozen liquor 
11 point: Pack my box with five dozen 

12 point: Pack my box with five dozen 


14 point: Pack my box with five 
16 point 18 point 20 point 


22 24 28 36 


If the number after .ps is not one of these
legal sizes, it is rounded up to the next valid
value, with a maximum of 36. If no number follows
.ps, troff reverts to the previous size, whatever
it was. troff begins with point size 10,
which is usually fine. This document is in 9
point.
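
The rounding rule just described can be illustrated with a short sketch. The Python below is not part of troff; the function name is invented for the example, and the list of sizes is simply the fifteen sizes shown above.

```python
# Illustrative sketch of the .ps rounding rule stated in the text (not troff code).

# The fifteen legal point sizes listed above.
LEGAL_SIZES = [6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 28, 36]

def round_point_size(requested):
    # Round up to the next legal size; anything above 36 is capped at 36.
    for size in LEGAL_SIZES:
        if size >= requested:
            return size
    return LEGAL_SIZES[-1]

print(round_point_size(13), round_point_size(25), round_point_size(40))
```

Under this reading of the rule, a request for size 13 yields 14, and a request for size 25 yields 28.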

The point size can also be changed in the
middle of a line or even a word with the in-line
command \s. To produce


UNIX runs on a PDP-11/45

type

\s8UNIX\s10 runs on a \s8PDP-\s1011/45


As above, \s should be followed by a legal point
size, except that \s0 causes the size to revert to
its previous value. Notice that \s1011 can be
understood correctly as ‘size 10, followed by an
11’, if the size is legal, but not otherwise. Be
cautious with similar constructions.


Relative size changes are also legal and 
useful: 


\s-2UNIX\s+2


temporarily decreases the size, whatever it is, by 
two points, then restores it. Relative size 
changes have the advantage that the size 
difference is independent of the starting size of 
the document. The amount of the relative 
change is restricted to a single digit. 


The other parameter that determines what
the type looks like is the spacing between lines,
which is set independently of the point size.
Vertical spacing is measured from the bottom of
one line to the bottom of the next. The command
to control vertical spacing is .vs. For running
text, it is usually best to set the vertical
spacing about 20% bigger than the character size.
For example, so far in this document, we have
used ‘9 on 11’, that is,


.ps 9
.vs 11p


If we changed to


.ps 9
.vs 9p

the running text would look like this. After a
few lines, you will agree it looks a little cramped.
The right vertical spacing is partly a matter of
taste, depending on how much text you want to
squeeze into a given space, and partly a matter of
traditional printing style. By default, troff
uses 10 on 12.


Point size and vertical spacing
make a substantial difference in the
amount of text per square inch.
This is 12 on 14.


Point size and vertical spacing make a substantial difference in
the amount of text per square inch. For example, 10 on 12 uses about
twice as much space as 7 on 8. This is 6 on 7, which is even smaller. It
packs a lot more words per line, but you can go blind trying to read it.

When used without arguments, .ps and .vs 
revert to the previous size and vertical spacing 
respectively. 


The command .sp is used to get extra vert- 
ical space. Unadorned, it gives you one extra 
blank line (one .vs, whatever that has been set
to). Typically, that’s more or less than you 
want, so .sp can be followed by information 
about how much space you want — 


.sp 2i

means ‘two inches of vertical space’.

.sp 2p

means ‘two points of vertical space’; and

.sp 2


means ‘two vertical spaces’, that is, two of whatever
.vs is set to (this can also be made explicit with
.sp 2v); troff also understands decimal fractions
in most places, so


.sp 1.5i


is a space of 1.5 inches. These same scale fac- 
tors can be used after .vs to define line spacing, 
and in fact after most commands that deal with 
physical dimensions. 


It should be noted that all size numbers 
are converted internally to ‘machine units’, 
which are 1/432 inch (1/6 point). For most pur- 
poses, this is enough resolution that you don’t 
have to worry about the accuracy of the
representation. The situation is not quite so 
good vertically, where resolution is 1/144 inch 
(1/2 point). 
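
As an illustration of the conversion to machine units, the following Python sketch turns a number with one of the scale factors i, p, or v into units of 1/432 inch. It is only a model of the arithmetic described above; the function name and the vs_points parameter (the current vertical spacing in points) are assumptions made for the example.

```python
# Illustrative conversion of scaled numbers to machine units (not troff code).

UNITS_PER_INCH = 432                      # machine units per inch, as stated above
UNITS_PER_POINT = UNITS_PER_INCH // 72    # 6 units per point

def to_units(value, scale, vs_points=12):
    # vs_points is the current .vs setting in points (12 is the default "10 on 12").
    if scale == "i":
        return round(value * UNITS_PER_INCH)
    if scale == "p":
        return round(value * UNITS_PER_POINT)
    if scale == "v":
        return round(value * vs_points * UNITS_PER_POINT)
    raise ValueError("unknown scale factor: " + scale)

print(to_units(1.5, "i"), to_units(2, "p"), to_units(2, "v", vs_points=11))
```

For instance, 1.5i corresponds to 648 units and 2p to 12 units; 2v depends on the vertical spacing in effect.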


3. Fonts and Special Characters 


troff and the typesetter allow four different 
fonts at any one time. Normally three fonts 
(Times roman, italic and bold) and one collec- 
tion of special characters are permanently 
mounted. 


abcdefghijklmnopqrstuvwxyz 0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz 0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz 0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ


The greek, mathematical symbols and miscellany 
of the special font are listed in Appendix A. 


troff prints in roman unless told otherwise.
To switch into bold, use the .ft command


.ft B

and for italics,

.ft I


To return to roman, use .ft R; to return to the
previous font, whatever it was, use either .ft P or
just .ft. The ‘underline’ command


.ul


causes the next input line to print in italics. .ul 
can be followed by a count to indicate that more 
than one line is to be italicized. 


Fonts can also be changed within a line or 
word with the in-line command \f: 


boldface text

is produced by

\fBbold\fIface\fR text


If you want to do this so the previous font,
whatever it was, is left undisturbed, insert extra
\fP commands, like this:

\fBbold\fP\fIface\fP\fR text\fP


Because only the immediately previous font is 
remembered, you have to restore the previous 
font after each change or you can lose it. The
same is true of .ps and .vs when used without an 
argument. 


There are other fonts available besides the
standard set, although you can still use only four
at any given time. The command .fp tells troff
what fonts are physically mounted on the
typesetter:


.fp 3 H


says that the Helvetica font is mounted on position
3. (For a complete list of fonts and what
they look like, see the troff manual.) Appropriate
.fp commands should appear at the beginning of
your document if you do not use the standard
fonts.


It is possible to make a document relatively
independent of the actual fonts used to
print it by using font numbers instead of names;
for example, \f3 and .ft 3 mean ‘whatever font
is mounted at position 3’, and thus work for any
setting. Normal settings are roman font on 1,
italic on 2, bold on 3, and special on 4.


There is also a way to get ‘synthetic’ bold 
fonts by overstriking letters with a slight offset.
Look at the .bd command in [1]. 


Special characters have four-character
names beginning with \(, and they may be
inserted anywhere. For example,


¼ + ½ = ¾

is produced by

\(14 + \(12 = \(34


In particular, greek letters are all of the form
\(*-, where - is an upper or lower case roman
letter reminiscent of the greek. Thus to get


Σ(α×β) → ∞

in bare troff we have to type

\(*S(\(*a\(mu\(*b) \(-> \(if

That line is unscrambled as follows:

\(*S    Σ
(       (
\(*a    α
\(mu    ×
\(*b    β
)       )
\(->    →
\(if    ∞

A complete list of these special names occurs in
Appendix A.

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In eqn [2] the same effect can be achieved 
with the input 


SIGMA ( alpha times beta ) —> inf 


which is less concise, but clearer to the unini- 
dated. 

Notice that each four-character name is a
single character as far as troff is concerned; the
‘translate’ command


.tr \(mi\(em

is perfectly clear, meaning the same as a .tr
command followed by the two literal characters;
that is, to translate each minus sign into an em dash.


Some characters are automatically
translated into others: grave ` and acute '
accents (apostrophes) become open and close
single quotes ‘ ’; the combination of ``...'' is
generally preferable to the double quotes "...".
Similarly a typed minus sign becomes a hyphen -. To
print an explicit minus sign, use \-. To get a
backslash printed, use \e.


4. Indents and Line Lengths 


troff starts with a line length of 6.5 inches,
too wide for 8½×11 paper. To reset the line
length, use the .ll command, as in


.ll 6i


As with .sp, the actual length can be specified in
several ways; inches are probably the most intui- 
tive. 

The maximum line length provided by the 
typesetter is 7.5 inches, by the way. To use the 
full width, you will have to reset the default
physical left margin (‘page offset’), which is
normally slightly less than one inch from the left
edge of the paper. This is done by the .po
command.


.po 0
sets the offset as far to the left as it will go.


The indent command .in causes the left
margin to be indented by some specified amount
from the page offset. If we use .in to move the
left margin in, and .ll to move the right margin
to the left, we can make offset blocks of text:

.in 0.3i
.ll -0.3i
text to be set into a block
.ll +0.3i
.in -0.3i


will create a block that looks like this: 


Pater noster qui est in caelis
sanctificetur nomen tuum: adveniat
regnum tuum: fiat voluntas tua, sicut 
in caelo, et in terra. ... Amen. 


Notice the use of ‘+’ and ‘-’ to specify the
amount of change. These change the previous
setting by the specified amount, rather than just
overriding it. The distinction is quite important:
.ll +1i makes lines one inch longer; .ll 1i makes
them one inch long.


With .in, .ll and .po, the previous value is
used if no argument is specified.


To indent a single line, use the ‘temporary 
indent’ command .ti. For example, all paragraphs 
in this memo effectively begin with the
command

.ti 3
Three of what? The default unit for .ti, as for
most horizontally oriented commands (.ll, .in,
.po), is ems; an em is roughly the width of the
letter ‘m’ in the current point size. (Precisely, an
em in size p is p points.) Although inches are
usually clearer than ems to people who don’t set
type for a living, ems have a place: they are a
measure of size that is proportional to the
current point size. If you want to make text that
keeps its proportions regardless of point size, you
should use ems for all dimensions. Ems can be
specified as scale factors directly, as in .ti 2.5m.


Lines can also be indented negatively if the 
indent is already positive: 


.ti -0.3i


causes the next line to be moved back three 
tenths of an inch. Thus to make a decorative 
initial capital, we indent the whole paragraph,
then move the letter ‘P’ back with a .ti
command:


Pater noster qui est in caelis
sanctificetur nomen tuum: adveniat
regnum tuum: fiat voluntas
tua, sicut in caelo, et in terra. ...
Amen.


Of course, there is also some trickery to make
the ‘P’ bigger (just a ‘\s36P\s0’), and to move it
down from its normal position (see the section
on local motions).


5. Tabs 


Tabs (the ASCII ‘horizontal tab’ character)
can be used to produce output in columns, or to
set the horizontal position of output. Typically
tabs are used only in unfilled text. Tab stops are
set by default every half inch from the current
indent, but can be changed by the .ta command.
To set stops every inch, for example,


.ta 1i 2i 3i 4i 5i 6i


Unfortunately the stops are left-justified
only (as on a typewriter), so lining up columns
of right-justified numbers can be painful. If you
have many numbers, or if you need more
complicated table layout, don’t use troff directly; use
the tbl program described in [3].


For a handful of numeric columns, you
can do it this way: Precede every number by
enough blanks to make it line up when typed
(here ‘tab’ stands for a tab character).


.nf
.ta 1i 2i 3i
  1 tab   2 tab   3
 40 tab  50 tab  60
700 tab 800 tab 900
.fi


Then change each leading blank into the string
\0. This is a character that does not print, but
that has the same width as a digit. When
printed, this will produce


  1   2   3
 40  50  60
700 800 900
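
The same input after the leading blanks have been changed to \0 might look like the sketch below. It assumes the columns are separated by single literal tab characters (they are invisible here) and is meant only to illustrate the substitution described above.

```troff
.\" Sketch: right-aligned numbers using \0 (digit-width space).
.\" The gaps between columns are single tab characters.
.nf
.ta 1i 2i 3i
\0\01	\0\02	\0\03
\040	\050	\060
700	800	900
.fi
```

Every entry now occupies the width of three digits, so the digits line up on the right even though the tab stops themselves are left-justified.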


It is also possible to fill up tabbed-over 
space with some character other than blanks by 
setting the ‘tab replacement character’ with the 
.tc command:


.ta 1.5i 2.5i
.tc \(ru        (\(ru is ‘_’)
Name tab Age tab


produces


Name __________ Age __________


To reset the tab replacement character to a
blank, use .tc with no argument. (Lines can also
be drawn with the \l command, described in
Section 6.)


troff also provides a very general mechan- 
ism called ‘fields’ for setting up complicated 
columns. (This is used by tbl). We will not go 
into it in this paper. 


6. Local Motions: Drawing lines and characters


Remember ‘Area = πr²’ and the big ‘P’
in the Paternoster. How are they done? troff
provides a host of commands for placing charac- 
ters of any size at any place. You can use them 
to draw special characters or to tune your output 
for a particular appearance. Most of these com- 
mands are straightforward, but messy to read 
and tough to type correctly. 


If you won't use eqn, subscripts and superscripts
are most easily done with the half-line
local motions \u and \d. To go back up the page
half a point-size, insert a \u at the desired place;
to go down, insert a \d. (\u and \d should always
be used in pairs, as explained below.) Thus


Area = \(*pr\u2\d
produces
Area = πr²


To make the ‘2’ smaller, bracket it with
\s-2...\s0. Since \u and \d refer to the current
point size, be sure to put them either both inside
or both outside the size changes, or you will get
an unbalanced vertical motion.


Sometimes the space given by \u and \d 
isn’t the right amount. The \v command can be 
used to request an arbitrary amount of vertical 
motion. The in-line command


\v'(amount)'


causes motion up or down the page by the
amount specified in ‘(amount)’. For example, to
move the ‘P’ down, we used


.in +0.6i       (move paragraph in)
.ll -0.3i       (shorten lines)
.ti -0.3i       (move P back)
\v'2'\s36P\s0\v'-2'ater noster qui est
in caelis ...


A minus sign causes upward motion, while no
sign or a plus sign means down the page. Thus
\v'-2' causes an upward vertical motion of two
line spaces.


There are many other ways to specify the
amount of motion:

\v'0.1i'
\v'3p'
\v'-0.5m'
and so on are all legal. Notice that the scale
specifier i or p or m goes inside the quotes. Any
character can be used in place of the quotes; this
is also true of all other troff commands described
in this section.


Since troff does not take within-the-line 
vertical motions into account when figuring out 
where it is on the page, output lines can have
unexpected positions if the left and right ends
aren't at the same vertical position. Thus \v, 
like \u and \d, should always balance upward 
vertical motion in a line with the same amount 
in the downward direction. 

Arbitrary horizontal motions are also available:
\h is quite analogous to \v, except that
the default scale factor is ems instead of line
spaces. As an example,


\h'-0.1i'
causes a backwards motion of a tenth of an inch.
As a practical matter, consider printing the
mathematical symbol ‘>>’. The default spacing
is too wide, so eqn replaces this by


>\h'-0.3m'>
to produce >>.


Frequently \h is used with the ‘width
function’ \w to generate motions equal to the width
of some character string. The construction


\w'thing'
is a number equal to the width of ‘thing’ in
machine units (1/432 inch). All troff
computations are ultimately done in these units. To
move horizontally the width of an ‘x’, we can
say

\h'\w'x'u'
As we mentioned above, the default scale factor
for all horizontal dimensions is m, ems, so here
we must have the u for machine units, or the
motion produced will be far too large. troff is
quite happy with the nested quotes, by the way,
so long as you don’t leave any out.
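
As a small illustration of combining \h and \w, the sketch below underlines a word by printing it, backing up by its own width, and then drawing a rule of the same length with \l (introduced later in this section). The word itself and the choice of @ as an alternative delimiter are only examples.

```troff
.\" Sketch: underline "word" by backing up over it and drawing a rule.
.\" The u after each \w'...' keeps the width in machine units.
word\h'-\w'word'u'\l'\w'word'u'
.br
.\" The same thing with @ as the outer delimiter, which avoids nested quotes.
word\h@-\w'word'u@\l@\w'word'u@
```

Leaving out either u would make troff read the width as a number of ems, and the motion would overshoot badly.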


As a live example of this kind of
construction, all of the command names in the text, like
.sp, were done by overstriking with a slight
offset. The commands for .sp are


.sp\h'-\w'.sp'u'\h'1u'.sp
That is, put out ‘.sp’, move left by the width of
‘.sp’, move right 1 unit, and print ‘.sp’ again.
(Of course there is a way to avoid typing that
much input for each command name, which we
will discuss in Section 11.)


There are also several special-purpose troff
commands for local motion. We have already
seen \0, which is an unpaddable white space of
the same width as a digit. ‘Unpaddable’ means
that it will never be widened or split across a line
by line justification and filling. There is also
\(blank), which is an unpaddable character the
width of a space, \|, which is half that width, \^,
which is one quarter of the width of a space, and
\&, which has zero width. (This last one is
useful, for example, in entering a text line which
would otherwise begin with a ‘.’.)


The command \o, used like
\o'set of characters'


causes (up to 9) characters to be overstruck,
centered on the widest. This is nice for accents, as
in


syst\o"e\(ga"me t\o"e\(aa"l\o"e\(aa"phonique
which makes


système téléphonique


The accents are \(ga and \(aa, or \` and \';
remember that each is just one character to troff.


You can make your own overstrikes with 
another special convention, \z, the zero-motion 
command. \zx suppresses the normal horizontal 
motion after printing the single character x, so 
another character can be laid on top of it. 
Although sizes can be changed within \o, it
centers the characters on the widest, and there
can be no horizontal or vertical motions, so \z
may be the only way to get what you want:


[figure: nested squares of increasing size]


is produced by


.sp 2
\s8\z\(sq\s14\z\(sq\s22\z\(sq\s36\(sq
The .sp is needed to leave room for the result.
As another example, an extra-heavy semicolon
(heavier than either a normal or an enlarged ‘;’)
can be constructed with a big comma and a big
period above it:
\s+6\z,\v'-0.25m'.\v'0.25m'\s0
‘0.25m’ is an empirical constant.


A more ornate overstrike is given by the 
bracketing function \b, which piles up characters 
vertically, centered on the current baseline. 
Thus we can get big brackets, constructing them 
with piled-up smaller pieces: 


{[ x ]}


by typing in only this:


.sp
\b'\(lt\(lk\(lb' \b'\(lc\(lf' x \b'\(rc\(rf' \b'\(rt\(rk\(rb'


troff also provides a convenient facility for
drawing horizontal and vertical lines of arbitrary
length with arbitrary characters. \l'1i' draws a
line one inch long, like this: __________.
The length can be followed by the character to
use if the _ isn’t appropriate; \l'0.5i.' draws a
half-inch line of dots: .......... The
construction \L is entirely analogous, except that it draws
a vertical line instead of horizontal.


7. Strings 


Obviously if a paper contains a large
number of occurrences of an acute accent over a
letter ‘e’, typing \o"e\'" for each é would be a
great nuisance.


Fortunately, troff provides a way in which
you can store an arbitrary collection of text in a
‘string’, and thereafter use the string name as a
shorthand for its contents. Strings are one of
several troff mechanisms whose judicious use
lets you type a document with less effort and 
organize it so that extensive format changes can 
be made with few editing changes. 


A reference to a string is replaced by what- 
ever text the string was defined as. Strings are 
defined with the command .ds. The line 


.ds e \o"e\'"
defines the string e to have the value \o"e\'"


String names may be either one or two
characters long, and are referred to by \*x for
one character names or \*(xy for two character
names. Thus to get téléphone, given the
definition of the string e as above, we can say
t\*el\*ephone.
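
A slightly fuller sketch of the same idea defines one string per accented letter and then uses both in running text. The string names e and E are arbitrary choices made for this illustration. Comments are kept on their own lines because anything that follows the text on a .ds line, including spaces before a comment, becomes part of the string.

```troff
.\" Sketch: strings for accented letters (names e and E are our own choice).
.\" e with an acute accent:
.ds e \o"e\(aa"
.\" e with a grave accent:
.ds E \o"e\(ga"
syst\*Eme t\*el\*ephonique
```

This should set the same ‘système téléphonique’ as the longer \o input shown in Section 6.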


If a string must begin with blanks, define it 
as 


.ds xx "     text


The double quote signals the beginning of the 
definition. There is no trailing quote: the end of 
the line terminates the string. 


A string may actually be several lines long: 
if troff encounters a \ at the end of any line, it is 
thrown away and the next line added to the 
current one. So you can make a long string simply
by ending each line but the last with a
backslash:


.ds xx this \ 
is a very \ 
long string 


Strings may be defined in terms of other 
strings, or even in terms of themselves; we will
discuss some of these possibilities later. 


8. Introduction to Macros 


Before we can go much further in troff, we 
need to learn a bit about the macro facility. In 
its simplest form, a macro is just a shorthand 
notation quite similar to a string. Suppose we 
want every paragraph to start in exactly the same
way, with a space and a temporary indent of
two ems:


.sp
.ti +2m


Then to save typing, we would like to collapse
these into one shorthand line, a troff ‘command’
like
.PP
that would be treated by troff exactly as


.sp
.ti +2m


.PP is called a macro. The way we tell troff what
.PP means is to define it with the .de command:


.de PP
.sp
.ti +2m
..


The first line names the macro (we used ‘.PP’
for ‘paragraph’, and upper case so it wouldn't
conflict with any name that troff might already
know about). The last line .. marks the end of
the definition. In between is the text, which is
simply inserted whenever troff sees the
‘command’ or macro call


.PP


A macro can contain any mixture of text and 
formatting commands. 


The definition of .PP has to precede its 
first use; undefined macros are simply ignored. 
Names are restricted to one or two characters. 


Using macros for commonly occurring
sequences of commands is critically important.
Not only does it save typing, but it makes later
changes much easier. Suppose we decide that
the paragraph indent is too small, the vertical
space is much too big, and roman font should be
forced. Instead of changing the whole
document, we need only change the definition of .PP
to something like


.de PP \" paragraph macro
.sp 2p
.ti +3m
.ft R
..


and the change takes effect everywhere we used
.PP.


\" is a troff command that causes the rest 
of the line to be ignored. We use it here to add 
comments to the macro definition (a wise idea 
once definitions get complicated). 


As another example of macros, consider
these two which start and end a block of offset,
unfilled text, like most of the examples in this
paper:
.de BS \" start indented block
.sp
.nf
.in +0.3i
..
.de BE \" end indented block
.sp
.fi
.in -0.3i
..


Now we can surround text like 


Copy to 

John Doe 
Richard Roberts 
Stanley Smith 


by the commands .BS and .BE, and it will come 
out as it did above. Notice that we indented by 
.in +0.3i instead of .in 0.3i. This way we can
nest our uses of .BS and .BE to get blocks within
blocks. 


If later on we decide that the indent should
be 0.5i, then it is only necessary to change the
definitions of .BS and .BE, not the whole paper.
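
To make the nesting concrete, here is a sketch that repeats the two definitions and then uses one block inside another. The sample text lines are placeholders.

```troff
.de BS \" start indented block
.sp
.nf
.in +0.3i
..
.de BE \" end indented block
.sp
.fi
.in -0.3i
..
Copy to
.BS
John Doe
Richard Roberts
.BS
(names indented a further 0.3i)
.BE
.BE
```

Because each .BS adds 0.3i to whatever indent is current and each .BE removes it, the inner block sits 0.6i in and the indent returns to its starting value after the last .BE. One thing to watch in this simple version: the inner .BE turns filling back on, so any lines that follow it inside the outer block would be filled unless another .nf is given.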


9. Titles, Pages and Numbering 


This is an area where things get tougher, 
because nothing is done for you automatically. 
Of necessity, some of this section is a cookbook, 
to be copied literally until you get some experi- 
ence. 


Suppose you want a title at the top of each
page, saying just

left top          center top          right top


In roff, one can say


.he 'left top'center top'right top'
.fo 'left bottom'center bottom'right bottom'


to get headers and footers automatically on every 
page. Alas, this doesn’t work in troff, a serious 
hardship for the novice. Instead you have to do 
a lot of specification. 


You have to say what the actual title is
(easy); when to print it (easy enough); and what
to do at and around the title line (harder).
Taking these in reverse order, first we define a
macro .NP (for ‘new page’) to process titles and
the like at the end of one page and the beginning
of the next:

.de NP
'bp
'sp 0.5i
.tl 'left top'center top'right top'
'sp 0.3i
..

To make sure we're at the top of a page, we
issue a ‘begin page’ command 'bp, which causes
a skip to top-of-page (we'll explain the ' shortly).
Then we space down half an inch, print the title
(the use of .tl should be self explanatory; later
we will discuss parameterizing the titles), space
another 0.3 inches, and we're done.


To ask for .NP at the bottom of each page,
we have to say something like ‘when the text is
within an inch of the bottom of the page, start
the processing for a new page.’ This is done with
a ‘when’ command .wh:
.wh -1i NP


(No ‘.’ is used before NP; this is simply the
name of a macro, not a macro call.) The minus
sign means ‘measure up from the bottom of the
page’, so ‘-1i’ means ‘one inch from the
bottom’.

The .wh command appears in the input
outside the definition of .NP; typically the input
would be


.de NP
...
..
.wh -1i NP


Now what happens? As text is actually 
being output, troff keeps track of its vertical 
position on the page, and after a line is printed 
within one inch from the bottom, the .NP macro 
is activated. (In the jargon, the .wh command
sets a trap at the specified place, which is
‘sprung’ when that point is passed.) .NP causes a
skip to the top of the next page (that’s what the
'bp was for), then prints the title with the
appropriate margins.


Why 'bp and 'sp instead of .bp and .sp?
The answer is that .sp and .bp, like several other
commands, cause a break to take place. That is,
all the input text collected but not yet printed is
flushed out as soon as possible, and the next
input line is guaranteed to start a new line of
output. If we had used .sp or .bp in the .NP
macro, this would cause a break in the middle of
the current output line when a new page is
started. The effect would be to print the
left-over part of that line at the top of the page,
followed by the next input line on a new output
line. This is not what we want. Using ' instead
of . for a command tells troff that no break is to
take place: the output line currently being
filled should not be forced out before the space
or new page.


The list of commands that cause a break is
short and natural:


.bp .br .ce .fi .nf .sp .in .ti


All others cause no break, regardless of whether
you use a . or a '. If you really need a break, add
a .br command at the appropriate place.


One other thing to beware of — if you're 
changing fonts or point sizes a lot, you may find 
that if you cross a page boundary in an
unexpected font or size, your titles come out in that
size and font instead of what you intended.
Furthermore, the length of a title is independent
of the current line length, so titles will come out
at the default length of 6.5 inches unless you
change it, which is done with the .lt command.


There are several ways to fix the problems 
of point sizes and fonts in titles. For the sim- 
plest applications, we can change .NP to set the 
proper size and font for the title, then restore 
the previous values, like this:


.de NP
'bp
'sp 0.5i
.ft R \" set title font to roman
.ps 10 \" and size to 10 point
.lt 6i \" and length to 6 inches
.tl 'left'center'right'
.ps \" revert to previous size
.ft P \" and to previous font
'sp 0.3i
..

This version of .NP does not work if the
fields in the .tl command contain size or font
changes. To cope with that requires troff’s
‘environment’ mechanism, which we will discuss 
in Section 13. 


To get a footer at the bottom of a page,
you can modify .NP so it does some processing
before the 'bp command, or split the job into a
footer macro invoked at the bottom margin and 
a header macro invoked at the top of the page. 
These variations are left as exercises. 
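
One possible shape for the second variation is sketched below: a footer macro trapped one inch above the bottom of the page and a header macro trapped at the top. The macro names FO and HD and the particular titles are assumptions made for this illustration, not part of the tutorial.

```troff
.\" Sketch: separate header and footer macros (names HD and FO are our own).
.de HD \" header: runs at the top of every page
'sp 0.5i
.tl 'left top'center top'right top'
'sp 0.3i
..
.de FO \" footer: runs one inch above the bottom
'sp 0.3i
.tl ''- % -''
'bp
..
.wh 0 HD
.wh -1i FO
```

The footer prints its title and then issues 'bp; starting the new page springs the trap at position 0, which runs the header. Both .wh lines should appear before any text so the trap at the top of the first page is already in place.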


Output page numbers are computed 
automatically as each page is produced (starting 
at 1), but no numbers are printed unless you ask 
for them explicitly. To get page numbers 
printed, include the character % in the .tl line at
the position where you want the number to
appear. For example


.tl ''- % -''


centers the page number inside hyphens, as on
this page. You can set the page number at any
time with either .bp n, which immediately starts
a new page numbered n, or with .pn n, which
sets the page number for the next page but
doesn’t cause a skip to the new page. Again,
.bp +n sets the page number to n more than its
current value; .bp means .bp +1.


10. Number Registers and Arithmetic


troff has a facility for doing arithmetic, and
for defining and using variables with numeric
values, called number registers. Number
registers, like strings and macros, can be useful in
setting up a document so it is easy to change
later. And of course they serve for any sort of
arithmetic computation.


Like strings, number registers have one or 
two character names. They are set by the .nr
command, and are referenced anywhere by \nx 
(one character name) or \n(xy (two character 
name). 


There are quite a few pre-defined number
registers maintained by troff, among them % for
the current page number; nl for the current
vertical position on the page; dy, mo and yr for the
current day, month and year; and .s and .f for
the current size and font. (The font is a number
from 1 to 4.) Any of these can be used in
computations like any other register, but some, like
.s and .f, cannot be changed with .nr.


As an example of the use of number
registers, in the -ms macro package [4], most
significant parameters are defined in terms of the
values of a handful of number registers. These
include the point size for text, the vertical
spacing, and the line and title lengths. To set the
point size and vertical spacing for the following
paragraphs, for example, a user may say


.nr PS 9
.nr VS 11


The paragraph macro .PP is defined (roughly) as 
follows: 


.de PP
.ps \\n(PS \" reset size
.vs \\n(VSp \" spacing
.ft R \" font
.sp 0.5v \" half a line
.ti +3m
..


This sets the font to Roman and the point size 
and line spacing to whatever values are stored in 
the number registers PS and VS. 


Why are there two backslashes? This is 
the eternal problem of how to quote a quote. 
When troff originally reads the macro definition, 
it peels off one backslash to see what's coming
next. To ensure that another is left in the
definition when the macro is used, we have to
put in two backslashes in the definition. If only
one backslash is used, point size and vertical
spacing will be frozen at the time the macro is
defined, not when it is used.


Protecting by an extra layer of backslashes
is only needed for \n, \*, \$ (which we haven't
come to yet), and \ itself. Things like \s, \f, \h,
\v, and so on do not need an extra backslash,
since they are converted by troff to an internal
code immediately upon being seen.


Arithmetic expressions can appear any- 
where that a number is expected. As a trivial 
example, 


.nr PS \\n(PS-2


decrements PS by 2. Expressions can use the
arithmetic operators +, -, *, /, % (mod), the
relational operators >, >=, <, <=, =, and
!= (not equal), and parentheses.


Although the arithmetic we have done so
far has been straightforward, more complicated
things are somewhat tricky. First, number
registers hold only integers. troff arithmetic uses
truncating integer division, just like Fortran.
Second, in the absence of parentheses,
evaluation is done left-to-right without any operator
precedence (including relational operators).
Thus


7*-4+3/13


becomes ‘-1’. Number registers can occur
anywhere in an expression, and so can scale
indicators like p, i, m, and so on (but no spaces).
Although integer division causes truncation, each
number and its scale indicator is converted to
machine units (1/432 inch) before any arithmetic
is done, so 1i/2u evaluates to 0.5i correctly.
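
The left-to-right rule is easy to check by storing a few expressions in registers and printing them with the .tm request, which writes a message to the terminal. The register names a, b and c are arbitrary, and the values in the comments are what the rules above predict rather than captured output.

```troff
.\" Sketch: left-to-right evaluation with truncating division.
.\" ((7*-4)+3)/13 = -25/13, which truncates to -1
.nr a 7*-4+3/13
.\" (7*(4+3))/13 = 49/13, which truncates to 3
.nr b 7*(4+3)/13
.\" (7*4)+(3/13) = 28+0 = 28
.nr c 7*4+(3/13)
.tm a=\na b=\nb c=\nc
```

Parentheses are the only way to force the grouping that ordinary precedence would otherwise supply, as the third line shows.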


The scale indicator u often has to appear
when you wouldn't expect it, in particular
when arithmetic is being done in a context that
implies horizontal or vertical dimensions. For
example,


.ll 7/2i


would seem obvious enough: 3½ inches.
Sorry. Remember that the default units for
horizontal parameters like .ll are ems. That's really
‘7 ems / 2 inches’, and when translated into
machine units, it becomes zero. How about


.ll 7i/2


Sorry, still no good: the ‘2’ is ‘2 ems’, so
‘7i/2’ is small, although not zero. You must use


.ll 7i/2u


So again, a safe rule is to attach a scale indicator
to every number, even constants.


For arithmetic done within a .nr command,
there is no implication of horizontal or vertical
dimension, so the default units are ‘units’, and
7i/2 and 7i/2u mean the same thing. Thus


.nr ll 7i/2
.ll \\n(llu


does just what you want, so long as you don't
forget the u on the .ll command.


11. Macros with arguments 


The next step is to define macros that can 
change from one use to the next according to 
parameters supplied as arguments. To make this 
work, we need two things: first, when we define 
the macro, we have to indicate that some parts
of it will be provided as arguments when the
macro is called. Then when the macro is called
we have to provide actual arguments to be 
plugged into the definition. 


Let us illustrate by defining a macro .SM
that will print its argument two points smaller
than the surrounding text. That is, the macro
call


.SM TROFF
will produce TROFF.
The definition of .SM is


.de SM
\s-2\\$1\s+2
..


Within a macro definition, the symbol \\$n
refers to the nth argument that the macro was
called with. Thus \\$1 is the string to be placed
in a smaller point size when .SM is called.


As a slightly more complicated version, the
following definition of .SM permits optional
second and third arguments that will be printed
in the normal size:


.de SM
\\$3\s-2\\$1\s+2\\$2
..


Arguments not provided when the macro is
called are treated as empty, so


.SM TROFF ),
produces TROFF), while
.SM TROFF ). (


produces (TROFF). It is convenient to reverse
the order of arguments because trailing
punctuation is much more common than leading.


By the way, the number of arguments that
a macro was called with is available in number
register .$.


The following macro .BD is the one used
to make the ‘bold roman’ we have been using
for troff command names in text. It combines
horizontal motions, width computations, and
argument rearrangement.


.de BD
\&\\$3\f1\\$1\h'-\w'\\$1'u+1u'\\$1\fP\\$2
..


The \h and \w commands need no extra
backslash, as we discussed above. The \& is
there in case the argument begins with a period.


Two backslashes are needed with the \\$n
commands, though, to protect one of them when
the macro is being defined. Perhaps a second
example will make this clearer. Consider a
macro called .SH which produces section
headings rather like those in this paper, with the
sections numbered automatically, and the title in
bold in a smaller size. The use is


.SH "Section title ..."


(If the argument to a macro is to contain blanks,
then it must be surrounded by double quotes,
unlike a string, where only one leading quote is
permitted.)


Here is the definition of the .SH macro:


.nr SH 0 \" initialize section number
.de SH
.sp 0.3i
.ft B
.nr SH \\n(SH+1 \" increment number
.ps \\n(PS-1 \" decrease PS
\\n(SH. \\$1 \" number. title
.ps \\n(PS \" restore PS
.sp 0.3i
.ft R
..


The section number is kept in number register 
SH, which is incremented each time just before it 
is used. (A number register may have the same 
name as a macro without conflict, but a string
may not.)


We used \\n(SH instead of \n(SH and 
\\n(PS instead of \n(PS. If we had used \n(SH,
we would get the value of the register at the time
the macro was defined, not at the time it was
used. If that's what you want, fine, but not here.
Similarly, by using \\n(PS, we get the point size 
at the time the macro is called. 


As an example that does not involve
numbers, recall our .NP macro which had a


.tl 'left'center'right'
We could make these into parameters by using
instead

.tl '\\*(LT'\\*(CT'\\*(RT'
so the title comes from three strings called LT,
CT and RT. If these are empty, then the title
will be a blank line. Normally CT would be set
with something like
.ds CT - % -


to give just the page number between hyphens
(as on the top of this page), but a user could
supply private definitions for any of the strings.


12. Conditionals 


Suppose we want the .SH macro to leave
two extra inches of space just before section 1,
but nowhere else. The cleanest way to do that is
to test inside the .SH macro whether the section
number is 1, and add some space if it is. The .if
command provides the conditional test that we
can add just before the heading line is output:


.if \\n(SH=1 .sp 2i \" first section only

The condition after the .if can be any
arithmetic or logical expression. If the condition
is logically true, or arithmetically greater than
zero, the rest of the line is treated as if it were
text (here a command). If the condition is
false, or zero or negative, the rest of the line is
skipped.

It is possible to do more than one
command if a condition is true. Suppose several
operations are to be done before section 1. One
possibility is to define a macro .S1 and invoke it
if we are about to do section 1 (as determined by
an .if).


.de S1
--- processing for section 1 ---
..
.de SH
...
.if \\n(SH=1 .S1
...
..


An alternate way is to use the extended
form of the .if, like this:


.if \\n(SH=1 \{--- processing
for section 1 ---\}


The braces \{ and \} must occur in the positions
shown or you will get unexpected extra lines in
your output. troff also provides an ‘if-else’
construction, which we will not go into here.
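
For readers who want a glimpse of it anyway, the if-else form uses the request pair .ie and .el. The sketch below is a cut-down section heading macro, not the full .SH definition from Section 11, that leaves two inches before the first section and 0.3 inch before all the others.

```troff
.\" Sketch: if-else inside a simplified heading macro.
.nr SH 0
.de SH
.nr SH \\n(SH+1
.ie \\n(SH=1 .sp 2i \" first section only
.el .sp 0.3i \" every later section
.ft B
\\n(SH. \\$1
.ft R
..
.SH "Section title ..."
```

The .el line applies to the most recent .ie, so the two requests should be kept next to each other.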

A condition can be negated by preceding it
with !; we get the same effect as above (but less
clearly) by using


.if !\\n(SH>1 .S1
There are a handful of other conditions
that can be tested with .if. For example, is the
current page even or odd?
.if e .tl ''even page title''
.if o .tl ''odd page title''


gives facing pages different titles when used
inside an appropriate new page macro.
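
As a sketch of what such a new page macro might look like, the version of .NP below puts the page number on the outside edge of each page. The title texts are placeholders, and the layout (number on the left of even pages, on the right of odd pages) is just one plausible choice.

```troff
.\" Sketch: different titles on even and odd pages.
.de NP
'bp
'sp 0.5i
.if e .tl '%'even page title''
.if o .tl ''odd page title'%'
'sp 0.3i
..
.wh -1i NP
```

The tests come after the 'bp, so e and o refer to the page that is just starting, not the one that has just ended.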


Two other conditions are t and n, which
tell you whether the formatter is troff or nroff.


.if t troff stuff ...
.if n nroff stuff ...


Finally, string comparisons may be made
in an .if:


.if 'string1'string2' stuff
does ‘stuff’ if string1 is the same as string2. The
character separating the strings can be anything
reasonable that is not contained in either string.
The strings themselves can reference strings with
\*, arguments with \$, and so on.


13. Environments 


As we mentioned, there is a potential
problem when going across a page boundary:
parameters like size and font for a page title may
well be different from those in effect in the text
when the page boundary occurs. troff provides a
very general way to deal with this and similar
situations. There are three ‘environments’, each
of which has independently settable versions of
many of the parameters associated with
processing, including size, font, line and title lengths,
fill/nofill mode, tab stops, and even partially
collected lines. Thus the titling problem may be
readily solved by processing the main text in one
environment and titles in a separate one with its
own suitable parameters.


The command .ev n shifts to environment
n; n must be 0, 1 or 2. The command .ev with
no argument returns to the previous environ-
ment. Environment names are maintained in a
stack, so calls for different environments may be
nested and unwound consistently.
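

The stack discipline just described can be modeled in a few lines. The sketch below is Python rather than troff and is only an illustration: the class name EnvironmentStack and its method ev are hypothetical, and the choice to stay put when a bare .ev finds nothing to return to is an assumption, not something this tutorial specifies.

```python
class EnvironmentStack:
    """Toy model of the .ev request: three environments switched via a stack."""

    def __init__(self):
        self.current = 0       # troff begins in environment 0
        self._previous = []    # environments to return to, innermost last

    def ev(self, n=None):
        """.ev n switches to environment n; .ev with no argument returns
        to the previous environment."""
        if n is None:
            if self._previous:             # assumption: ignore a pop on an empty stack
                self.current = self._previous.pop()
        else:
            if n not in (0, 1, 2):
                raise ValueError("environment must be 0, 1 or 2")
            self._previous.append(self.current)
            self.current = n
        return self.current


stack = EnvironmentStack()
stack.ev(1)    # main text -> title environment
stack.ev(2)    # a nested switch
stack.ev()     # unwinds to environment 1
stack.ev()     # unwinds to environment 0
```

Because each switch remembers where it came from, nested calls unwind in the reverse order in which they were made.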


Suppose we say that the main text is pro- 
cessed in environment 0, which is where troff 
begins by default. Then we can modify the new 
page macro .NP to process titles in environment
1 like this:


.de NP
.ev 1 \" shift to new environment
.lt 6i \" set parameters here
.ft R
.ps 10
... any other processing ...
.ev \" return to previous environment
..


It is also possible to initialize the parameters for
an environment outside the .NP macro, but the
version shown keeps all the processing in one
place and is thus easier to understand and
change.


14. Diversions 


There are numerous occasions in page lay- 
out when it is necessary to store some text for a 
period of time without actually printing it. Foot- 
notes are the most obvious example: the text of 
the footnote usually appears in the input well
before the place on the page where it is to be 
printed is reached. In fact, the place where it is 
output normally depends on how big it is, which 
implies that there must be a way to process the 
footnote at least enough to decide its size 
without printing it. 


troff provides a mechanism called a diver-
sion for doing this processing. Any part of the
output may be diverted into a macro instead of 
being printed, and then at some convenient time 
the macro may be put back into the input. 


The command .di xy begins a diversion — 
all subsequent output is collected into the macro 
xy until the command .di with no arguments is 
encountered. This terminates the diversion. 
The processed text is available at any time 
thereafter, simply by giving the command 


.xy


The vertical size of the last finished diversion is 
contained in the built-in number register dn. 


As a simple example, suppose we want to
implement a 'keep-release' operation, so that
text between the commands .KS and .KE will not
be split across a page boundary (as for a figure or
table). Clearly, when a .KS is encountered, we
have to begin diverting the output so we can find
out how big it is. Then when a .KE is seen, we
decide whether the diverted text will fit on the
current page, and print it either there if it fits, or
at the top of the next page if it doesn't. So:


.de KS \" start keep
.br \" start fresh line
.ev 1 \" collect in new environment
.fi \" make it filled text
.di XX \" collect in XX
..
.de KE \" end keep
.br \" get last partial line
.di \" end diversion
.if \\n(dn>=\\n(.t .bp \" bp if doesn't fit
.nf \" bring it back in no-fill
.XX \" text
.ev \" return to normal environment
..


Recall that number register nl is the current
position on the output page. Since output was
being diverted, this remains at its value when the
diversion started. dn is the amount of text in
the diversion; .t (another built-in register) is the
distance to the next trap, which we assume is at
the bottom margin of the page. If the diversion
is large enough to go past the trap, the .if is
satisfied, and a .bp is issued. In either case, the
diverted output is then brought back with .XX. It
is essential to bring it back in no-fill mode so
troff will do no further processing on it.
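

The decision made inside .KE can be restated outside troff. The sketch below is Python and purely illustrative: the function name place_keep and its parameter names are hypothetical, and the two numbers stand for the registers dn and .t in whatever units troff happens to be using.

```python
def place_keep(dn, distance_to_trap):
    """Return the requests .KE issues once the diversion has been closed.

    dn               -- vertical size of the diverted text (register dn)
    distance_to_trap -- distance to the next trap (register .t)
    """
    requests = []
    if dn >= distance_to_trap:     # same test as .if \\n(dn>=\\n(.t
        requests.append(".bp")     # the keep will not fit: start a new page first
    requests += [".nf", ".XX", ".ev"]   # replay the text unfilled, then restore the environment
    return requests


print(place_keep(dn=10, distance_to_trap=40))   # fits on the current page
print(place_keep(dn=50, distance_to_trap=40))   # forces a page break first
```

The only thing that varies between the two cases is the leading .bp; the diverted text is replayed the same way either way.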


This is not the most general keep-release, 
nor is it robust in the face of all conceivable 
inputs, but it would require more space than we 
have here to write it in full generality. This sec-
tion is not intended to teach everything about
diversions, but to sketch out enough that you
can read existing macro packages with some
comprehension.


Appendix A: Phototypesetter Character Set


These characters exist in roman, italic, and bold. To get the one on the left, type the four-character 
name on the right. 


ff  \(ff     fi  \(fi     fl  \(fl     ffi \(Fi     ffl \(Fl


_   \(ru     —   \(em     ¼   \(14     ½   \(12     ¾   \(34
©   \(co     °   \(de     †   \(dg     ′   \(fm     ¢   \(ct
®   \(rg     •   \(bu     □   \(sq     -   \(hy


(In bold, \(sq is ■.)


The following are special-font characters:


+   \(pl     −   \(mi     ×   \(mu     ÷   \(di
=   \(eq     ≡   \(==     ≥   \(>=     ≤   \(<=
≠   \(!=     ±   \(+-     ¬   \(no     /   \(sl
∼   \(ap     ≅   \(~=     ∝   \(pt     ∇   \(gr
→   \(->     ←   \(<-     ↑   \(ua     ↓   \(da
∫   \(is     ∂   \(pd     ∞   \(if     √   \(sr
⊂   \(sb     ⊃   \(sp     ∪   \(cu     ∩   \(ca
⊆   \(ib     ⊇   \(ip     ∈   \(mo     ∅   \(es
´   \(aa     `   \(ga     ○   \(ci     (Bell System logo) \(bs
§   \(sc     ‡   \(dd     ☜   \(lh     ☞   \(rh
⎧   \(lt     ⎫   \(rt     ⎡   \(lc     ⎤   \(rc
⎩   \(lb     ⎭   \(rb     ⎣   \(lf     ⎦   \(rf
⎨   \(lk     ⎬   \(rk     ⎪   \(bv     ς   \(ts
│   \(br     |   \(or     _   \(ul     ‾   \(rn
∗   \(**


These four characters also have two-character names. The ´ is the apostrophe on terminals; the ` is the
other quote mark.


´  \'     `  \`     −  \-     _  \_


These characters exist only on the special font, but they do not have four-character names: 
aaa Vrs < > ~ * \ # @ 
For greek, precede the roman letter by \(* to get the corresponding greek; for example, \(*a is α.


a b g d e z y h i k l m n c o p r s t u f x q w
α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ τ υ φ χ ψ ω


A B G D E Z Y H I K L M N C O P R S T U F X Q W
Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω


Brian W. Kernighan and Lorinda L. Cherry


Bell Laboratories 
Murray Hill, New Jersey 07974 


ABSTRACT 


This paper describes the design and implementation of a system for typesetting 
mathematics. The language has been designed to be easy to learn and to use by people 
(for example, secretaries and mathematical typists) who know neither mathematics nor
typesetting. Experience indicates that the language can be learned in an hour or so, for 
it has few rules and fewer exceptions. For typical expressions, the size and font 
changes, positioning, line drawing, and the like necessary to print according to 
mathematical conventions are all done automatically. For example, the input 


sum from i=0 to infinity x sub i = pi over 2
produces


$\sum_{i=0}^{\infty} x_i = \frac{\pi}{2}$


The syntax of the language is specified by a small context-free grammar; a
compiler-compiler is used to make a compiler that translates this language into typeset- 
ting commands. Output may be produced on either a phototypesetter or on a terminal 
with forward and reverse half-line motions. The system interfaces directly with text 
formatting programs, so mixtures of text and mathematics may be handled simply. 


This paper is a revision of a paper originally published in CACM, March, 1975. 


1. Introduction


"Mathematics is known in the trade as
difficult, or penalty, copy because it is slower,
more difficult, and more expensive to set in type
than any other kind of copy normally occurring
in books and journals." [1]


One difficulty with mathematical text is the
multiplicity of characters, sizes, and fonts. An
expression such as


$\lim_{x \to \pi/2} (\tan x)^{\sin 2x} = 1$


requires an intimate mixture of roman, italic and
greek letters, in three sizes, and a special charac-
ter or two. ("Requires" is perhaps the wrong
word, but mathematics has its own typographical
conventions which are quite different from those
of ordinary text.) Typesetting such an expression
by traditional methods is still an essentially
manual operation.


A second difficulty is the two dimensional
character of mathematics, which the superscript
and limits in the preceding example showed in its
simplest form. This is carried further by





$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cdots}}}$
and still further by 
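$\int \frac{dx}{a e^{mx} - b e^{-mx}} =
 \left\{ \begin{array}{l}
 \frac{1}{2m\sqrt{ab}} \log \frac{\sqrt{a}\,e^{mx} - \sqrt{b}}{\sqrt{a}\,e^{mx} + \sqrt{b}} \\
 \frac{1}{m\sqrt{ab}} \tanh^{-1} \left( \frac{\sqrt{a}}{\sqrt{b}}\,e^{mx} \right) \\
 \frac{-1}{m\sqrt{ab}} \coth^{-1} \left( \frac{\sqrt{a}}{\sqrt{b}}\,e^{mx} \right)
 \end{array} \right.$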
These examples also show line-drawing, built-up
characters like braces and radicals, and a spec-
trum of positioning problems. (Section 6 shows
what a user has to type to produce these on our
system.)


2. Photocomposition 


Photocomposition techniques can be used 
to solve some of the problems of typesetting 
mathematics. A phototypesetter is a device 
which exposes a piece of photographic paper or 
film, placing characters wherever they are 
wanted. The Graphic Systems phototypesetter [2]
on the UNIX operating system [3] works by shin-
ing light through a character stencil. The charac- 
ter is made the right size by lenses, and the light 
beam directed by fiber optics to the desired place 
on a piece of photographic paper. The exposed 
paper is developed and typically used in some 
form of photo-offset reproduction. 


On UNIX, the phototypesetter is driven by 
a formatting program called TROFF [4]. TROFF
was designed for setting running text. It also
provides all of the facilities that one needs for
doing mathematics, such as arbitrary horizontal
and vertical motions, line-drawing, size changing,
but the syntax for describing these special opera-
tions is difficult to learn, and difficult even for
experienced users to type correctly.


For this reason we decided to use TROFF
as an "assembly language," by designing a
language for describing mathematical expres- 
sions, and compiling it into TROFF. 


3. Language Design 


The fundamental principle upon which we 
based our language design is that the language 
should be easy to use by people (for example, 
secretaries) who know neither mathematics nor 
typesetting. 


This principle implies several things. First,
"normal" mathematical conventions about
operator precedence, parentheses, and the like
cannot be used, for to give special meaning to
such characters means that the user has to
understand what he or she is typing. Thus the
language should not assume, for instance, that
parentheses are always balanced, for they are not
in the half-open interval (a,b]. Nor should it
assume that $\sqrt{a+b}$ can be replaced by
$(a+b)^{1/2}$, or that $1/(1-x)$ is better written as
$\frac{1}{1-x}$ (or vice versa).

Second, there should be relatively few 
rules, keywords, special symbols and operators, 
and the like. This keeps the language easy to 
learn and remember. Furthermore, there should
be few exceptions to the rules that do exist: if 
something works in one situation, it should work 
everywhere. If a variable can have a subscript, 
then a subscript can have a subscript, and so on 


without limit. 


Third, "standard" things should happen
automatically. Someone who types
"x=y+z+1" should get "$x=y+z+1$". Sub-
scripts and superscripts should automatically be
printed in an appropriately smaller size, with no
special intervention. Fraction bars have to be
made the right length and positioned at the right
height. And so on. Indeed a mechanism for
overriding default actions has to exist, but its
application is the exception, not the rule.


We assume that the typist has a reasonable 
picture (a two-dimensional representation) of the 
desired final form, as might be handwritten by 
the author of a paper. We also assume that the 
input is typed on a computer terminal much like 
an ordinary typewriter. This implies an input 
alphabet of perhaps 100 characters, none of them 
special. 


A secondary, but still important, goal in 
our design was that the system should be easy to 
implement, since neither of the authors had any 
desire to make a long-term project of it. Since 
our design was not firm, it was also necessary 
that the program be easy to change at any time. 


To make the program easy to build and to 
change, and to guarantee regularity ("it should
work everywhere"), the language is defined by a
context-free grammar, described in Section 5.
The compiler for the language was built using a 
compiler-compiler. 


A priori, the grammar/compiler-compiler 
approach seemed the right thing to do. Our sub- 
sequent experience leads us to believe that any 
other course would have been folly. The original 
language was designed in a few days. Construc- 
tion of a working system sufficient to try 
significant examples required perhaps a person- 
month. Since then, we have spent a modest 
amount of additional time over several years 
tuning, adding facilities, and occasionally chang- 
ing the language as users make criticisms and 
suggestions. 


We also decided quite early that we would 
let TROFF do our work for us whenever possible. 
TROFF is quite a powerful program, with a macro 
facility, text and arithmetic variables, numerical 
computation and testing, and conditional branch- 
ing. Thus we have been able to avoid writing a 
lot of mundane but tricky software. For exam- 
ple, we store no text strings, but simply pass 
them on to TROFF. Thus we avoid having to 
write a storage management package. Further- 
more, we have been able to isolate ourselves 
from most details of the particular device and 
character set currently in use. For example, we
let TROFF compute the widths of all strings of
characters; we need know nothing about them.


A third design goal is special to our 
environment. Since our program is only useful 
for typesetting mathematics, it is necessary that it 
interface cleanly with the underlying typesetting 
language for the benefit of users who want to set 
intermingled mathematics and text (the usual
case). The standard mode of operation is that 
when a document is typed, mathematical expres- 
sions are input as part of the text, but marked by 
user settable delimiters. The program reads this 
input and treats as comments those things which 
are not mathematics, simply passing them 
through untouched. At the same time it con- 
verts the mathematical input into the necessary 
TROFF commands. The resulting output is
passed directly to TROFF where the comments 
and the mathematical parts both become text 
and/or TROFF commands. 


4. The Language 


We will not try to describe the language
precisely here; interested readers may refer to
the appendix for more details. Throughout this
section, we will write expressions exactly as they
are handed to the typesetting program
(hereinafter called "EQN"), except that we won't
show the delimiters that the user types to mark 
the beginning and end of the expression. The 
interface between EQN and TROFF is described at 
the end of this section. 


As we said, typing x=y+z+1 should pro-
duce $x=y+z+1$, and indeed it does. Variables
are made italic, operators and digits become 
roman, and normal spacings between letters and 
operators are altered slightly to give a more 
pleasing appearance. 


Input is free-form. Spaces and new lines 
in the input are used by EQN to separate pieces 
of the input; they are not used to create space in 
the output. Thus 


x = y
+ z + 1


also gives $x=y+z+1$. Free-form input is easier
to type initially; subsequent editing is also easier,
for an expression may be typed as many short 
lines. 


Extra white space can be forced into the 
output by several characters of various sizes. A
tilde "~" gives a space equal to the normal word
spacing in text, a circumflex gives half this
much, and a tab character spaces to the next tab
stop. 


Spaces (or tildes, etc.) also serve to delimit 
pieces of the input. For example, to get 


$f(t) = 2\pi \int \sin(\omega t)\,dt$


we write 
f(t) = 2 pi int sin ( omega t )dt 


Here spaces are necessary in the input to indicate
that sin, pi, int, and omega are special, and poten-
tially worth special treatment. EQN looks up
each such string of characters in a table, and if
appropriate gives it a translation. In this case, pi
and omega become their greek equivalents, int
becomes the integral sign (which must be moved
down and enlarged so it looks "right"), and sin
is made roman, following conventional
mathematical practice. Parentheses, digits and
operators are automatically made roman wher-
ever found.
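

The role of blanks as delimiters can be illustrated with a small lookup. The sketch below is Python and is not EQN's actual lexical analyzer: the table contents, the category labels, and the function name classify are all hypothetical, chosen only to mirror the example above. It also simply discards tildes and circumflexes, whereas EQN turns them into output space.

```python
import re

# Hypothetical lookup table; the real table inside EQN is not reproduced in this paper.
TABLE = {
    "pi": "greek",
    "omega": "greek",
    "int": "integral sign",
    "sin": "roman",
}


def classify(source):
    """Split the input at blanks, tildes and circumflexes, then look up each piece."""
    pieces = [p for p in re.split(r"[\s~^]+", source) if p]
    return [(piece, TABLE.get(piece, "default")) for piece in pieces]


for piece, treatment in classify("f(t) = 2 pi int sin ( omega t )dt"):
    print(piece, "->", treatment)

# Without the blank, "sin(x)" is a single piece and is not found in the table.
print(classify("sin(x)"))
```

A keyword is recognized only when it stands alone between delimiters, which is why the blanks around sin, pi, int and omega matter.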


Fractions are specified with the keyword 
over: 


a+b over c+d+e = 1
produces


$\frac{a+b}{c+d+e} = 1$


Similarly, subscripts and superscripts are
introduced by the keywords sub and sup:


$x^2 + y^2 = z^2$
is produced by
x sup 2 + y sup 2 = z sup 2


The spaces after the 2's are necessary to mark
the end of the superscripts; similarly the keyword
sup has to be marked off by spaces or some
equivalent delimiter. The return to the proper
baseline is automatic. Multiple levels of sub-
scripts or superscripts are of course allowed:
"x sup y sup z" is $x^{y^z}$. The construct "some-
thing sub something sup something" is recog-
nized as a special case, so "x sub i sup 2" is $x_i^2$
instead of ${x_i}^2$.


More complicated expressions can now be 
formed with these primitives: 
$\frac{\partial^2 f}{\partial x^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}$
is produced by 


{partial sup 2 f} over {partial x sup 2} = 
x sup 2 over a sup 2 + y sup 2 over b sup 2 


Braces {} are used to group objects together; in
this case they indicate unambiguously what goes
over what on the left-hand side of the expres-
sion. The language defines the precedence of sup
to be higher than that of over, so no braces are
needed to get the correct association on the right 
side. Braces can always be used when in doubt 
about precedence. 


The braces convention is an example of
the power of using a recursive grammar to define
the language. It is part of the language that if a
construct can appear in some context, then any
expression in braces can also occur in that con- 
text. 


There is a sqrt operator for making square
roots of the appropriate size: "sqrt a+b" pro-
duces $\sqrt{a+b}$, and


x = {-b +- sqrt{b sup 2 -4ac}} over 2a


is


$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
Since large radicals look poor on our typesetter, 
sqrt is not useful for tall expressions. 


Limits on summations, integrals and simi-
lar constructions are specified with the keywords
from and to. To get


$\sum_{i=0}^{\infty} x_i \to 0$


we need only type


sum from i=0 to inf x sub i -> 0


Centering and making the Σ big enough and the
limits smaller are all automatic. The from and to
parts are both optional, and the central part (e.g.,
the Σ) can in fact be anything:


lim from {x -> pi /2} ( tan~x) = inf


is


$\lim_{x \to \pi/2} (\tan x) = \infty$


Again, the braces indicate just what goes into the 
from part. 


There is a facility for making braces, 
brackets, parentheses, and vertical bars of the 
right height, using the keywords left and right:


left [ x+y over 2a right ] ~=~ 1
makes


$\left[ \frac{x+y}{2a} \right] = 1$


A left need not have a corresponding right, as we 
shall see in the next example. Any characters 
may follow left and right, but generally only vari-
ous parentheses and bars are meaningful. 


Big brackets, etc., are often used with
another facility, called piles, which makes vertical
piles of objects. For example, to get


$sign(x) \equiv \left\{ \begin{array}{rll} 1 & if & x>0 \\ 0 & if & x=0 \\ -1 & if & x<0 \end{array} \right.$


we can type


sign (x) ~==~ left {
   rpile {1 above 0 above -1}
   ~~lpile {if above if above if}
   ~~lpile {x>0 above x=0 above x<0}


The construction "left {" makes a left brace big
enough to enclose the "rpile {...}", which is a
right-justified pile of "above ... above ...".
"lpile" makes a left-justified pile. There are also
centered piles. Because of the recursive language
definition, a pile can contain any number of ele-
ments; any element of a pile can of course con-
tain piles.


Although EQN makes a valiant attempt to 
use the right sizes and fonts, there are times 
when the default assumptions are simply not 
what is wanted. For instance the italic sign in the
previous example would conventionally be in
roman. Slides and transparencies often require
larger characters than normal text. Thus we also
provide size and font changing commands: "size
12 bold {A~x~=~y}" will produce $\mathbf{A\ x\ =\ y}$.
Size is followed by a number representing a char-
acter size in points. (One point is 1/72 inch; this
paper is set in 9 point type.)

If necessary, an input string can be quoted
in "...", which turns off grammatical significance,
and any font or spacing changes that might oth-
erwise be done on it. Thus we can say


lim~ roman "sup" ~x sub n = 0


to ensure that the supremum doesn't become a
superscript:


$\lim\ \mathrm{sup}\ x_n = 0$


Diacritical marks, long a problem in tradi- 
tional typesetting, are straightforward: 


$\underline{\dot{x}} + \hat{x} + \tilde{y} + \hat{X} + \ddot{Y} = \overline{z+Z}$
is made by typing 


x dot under + x hat + y tilde 
+ X hat + Y dotdot = z+Z bar 


There are also facilities for globally chang- 
ing default sizes and fonts, for example for mak- 
ing viewgraphs or for setting chemical equations. 
The language allows for matrices, and for lining 
up equations at the same horizontal position. 


Finally, there is a definition facility, so a
user can say


define name "..."


at any time in the document; henceforth, any
occurrence of the token "name" in an expres-
sion will be expanded into whatever was inside
the double quotes in its definition. This lets
users tailor the language to their own
specifications, for it is quite possible to redefine
keywords like sup or over. Section 6 shows an
example of definitions.
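

Token replacement of this kind is easy to picture in code. The sketch below is Python and only a model of the idea: the function name expand and the dictionary of definitions are hypothetical, tokens are split at blanks only, and the decision to expand the replacement text again (so one definition may use another) is an assumption about EQN's behavior rather than something stated here.

```python
def expand(tokens, definitions):
    """Replace every defined token by the tokens of its definition."""
    result = []
    for token in tokens:
        if token in definitions:
            # Assumption: replacement text is itself scanned for defined names.
            # A definition that mentions its own name would recurse without end.
            result.extend(expand(definitions[token].split(), definitions))
        else:
            result.append(token)
    return result


definitions = {"sa": "{ sqrt a }", "half": "1 over 2"}
print(" ".join(expand("half sa".split(), definitions)))
```

Since keywords are looked up the same way as any other token, the same mechanism would let a user redefine sup or over.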


The EQN preprocessor reads intermixed
text and equations, and passes its output to
TROFF. Since TROFF uses lines beginning with a
period as control words (e.g., ".ce" means
"center the next output line"), EQN uses the
sequence ".EQ" to mark the beginning of an
equation and ".EN" to mark the end. The
".EQ" and ".EN" are passed through to TROFF
untouched, so they can also be used by a
knowledgeable user to center equations, number
them automatically, etc. By default, however,
".EQ" and ".EN" are simply ignored by TROFF,
so by default equations are printed in-line.


".EQ" and ".EN" can be supplemented
by TROFF commands as desired; for example, a
centered display equation can be produced with
the input:


.ce
.EQ
x sub i = y sub i ...
.EN


Since it is tedious to type ".EQ" and
".EN" around very short expressions (single
letters, for instance), the user can also define
two characters to serve as the left and right del-
imiters of expressions. These characters are
recognized anywhere in subsequent text. For
example if the left and right delimiters have both
been set to "#", the input:


Let #x sub i#, #y# and #alpha# be positive
produces:


Let $x_i$, $y$ and $\alpha$ be positive
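

Separating in-line mathematics from the surrounding text is a simple scan when both delimiters are the same character. The sketch below is Python and purely illustrative: the function name split_inline is hypothetical, it handles only the case of identical left and right delimiters, and it assumes the delimiters in the line are balanced.

```python
def split_inline(line, delim="#"):
    """Label alternate segments of a line as text and math."""
    segments = line.split(delim)
    # Segments at even positions lie outside the delimiters, odd ones inside.
    return [("math" if i % 2 else "text", seg) for i, seg in enumerate(segments)]


for kind, segment in split_inline("Let #x sub i#, #y# and #alpha# be positive"):
    print(kind, repr(segment))
```

Only the segments labeled math would be translated; the text segments pass through untouched.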


Running a preprocessor is strikingly easy 
on UNIX. To typeset text stored in file "f", one
issues the command: 


eqn f | troff 


The vertical bar connects the output of one pro- 
cess (EQN) to the input of another (TROFF). 


5. Language Theory 


The basic structure of the language is not a 
particularly original one. Equations are pictured 
as a set of "boxes," pieced together in various
ways. For example, something with a subscript 
is just a box followed by another box moved 
downward and shrunk by an appropriate amount. 
A fraction is just a box centered above another 
box, at the right altitude, with a line of correct 
length drawn between them. 


The grammar for the language is shown
below. For purposes of exposition, we have col-
lapsed some productions. In the original gram-
mar, there are about 70 productions, but many
of these are simple ones used only to guarantee
that some keyword is recognized early enough in
the parsing process. Symbols in capital letters
are terminal symbols; lower case symbols are
non-terminals, i.e., syntactic categories. The
vertical bar | indicates an alternative; the brack-
ets [ ] indicate optional material. A TEXT is a
string of non-blank characters or any string
inside double quotes; the other terminal symbols
represent literal occurrences of the corresponding
keyword.


eqn  : box | eqn box


box  : text
     | { eqn }
     | box OVER box
     | SQRT box
     | box SUB box | box SUP box
     | [ L | C | R ] PILE { list }
     | LEFT text eqn [ RIGHT text ]
     | box [ FROM box ] [ TO box ]
     | SIZE text box
     | [ ROMAN | BOLD | ITALIC ] box
     | box [ HAT | BAR | DOT | DOTDOT | TILDE ]
     | DEFINE text text


list : eqn | list ABOVE eqn
text : TEXT


The grammar makes it obvious why there 
are few exceptions. For example, the observa- 
tion that something can be replaced by a more 
complicated something in braces is implicit in the 
productions: 


eqn : box | eqn box 
box : text| { eqn } 


Anywhere a single character could be used, any 
legal construction can be used. 


Clearly, our grammar is highly ambiguous. 
What, for instance, do we do with the input 


a over b over c ?
Is it


{a over b} over c
or is it


a over {b over c} ?

To answer questions like this, the grammar 
is supplemented with a small set of rules that 
describe the precedence and associativity of 
operators. In particular, we specify (more or less 
arbitrarily) that over associates to the left, so the 


first alternative above is the one chosen. On the 
other hand, sub and sup bind to the right,
because this is closer to standard mathematical
practice. That is, we assume $x^{a^b}$ is $x^{(a^b)}$, not
$(x^a)^b$.


The precedence rules resolve the ambiguity 
in a construction like 


a sup 2 over b 


We define sup to have a higher precedence than
over, so this construction is parsed as $\frac{a^2}{b}$ instead
of $a^{\frac{2}{b}}$.


Naturally, a user can always force a partic- 
ular parsing by placing braces around expres- 
sions. 
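

The effect of these disambiguating rules can be tried out with a small parser. The sketch below is Python and uses precedence climbing, a different technique from the parser the compiler-compiler actually generates; the numeric precedence values, the function names, and the restriction to single operands between operators are all assumptions made for illustration. It does not model the special case for "something sub something sup something".

```python
# (precedence, associativity); only the relative order of the numbers matters.
OPERATORS = {
    "over": (1, "left"),
    "sub": (2, "right"),
    "sup": (2, "right"),
}


def parse(tokens, min_prec=1):
    """Precedence climbing; consumes the token list and returns a nested tuple."""
    left = tokens.pop(0)
    while tokens and tokens[0] in OPERATORS and OPERATORS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPERATORS[op]
        # A left-associative operator must not absorb another of the same level.
        right = parse(tokens, prec + 1 if assoc == "left" else prec)
        left = (op, left, right)
    return left


def braced(tree):
    """Write the tree back out with every grouping made explicit."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return "{" + braced(left) + " " + op + " " + braced(right) + "}"


def group(source):
    return braced(parse(source.split()))


for example in ("a over b over c", "x sup y sup z", "a sup 2 over b"):
    print(example, "=>", group(example))
```

Typing the braces shown in the output by hand would force the same grouping without relying on the rules.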


The ambiguous grammar approach seems 
to be quite useful. The grammar we use is small 
enough to be easily understood, for it contains 
none of the productions that would be normally 
used for resolving ambiguity. Instead the sup- 
plemental information about precedence and 
associativity (also small enough to be under- 
stood) provides the compiler-compiler with the 
information it needs to make a fast, deterministic 
parser for the specific language we want. When 
the language is supplemented by the disambi-
guating rules, it is in fact LR(1) and thus easy to
parse [5].


The output code is generated as the input 
is scanned. Any time a production of the gram-
mar is recognized, (potentially) some TROFF
commands are output. For example, when the 
lexical analyzer reports that it has found a TEXT 
(i.e., a string of contiguous characters), we have 
recognized the production: 


text : TEXT 


The translation of this is simple. We generate a 
local name for the string, then hand the name 
and the string to TROFF, and let TROFF perform 
the storage management. All we save is the 
name of the string, its height, and its baseline. 


As another example, the translation associ- 
ated with the production 


box : box OVER box 
is: 


Width of output box =
    slightly more than largest input width
Height of output box =
    slightly more than sum of input heights
Base of output box =
    slightly more than height of bottom input box
String describing output box =
    move down;
    move right enough to center bottom box;
    draw bottom box (i.e., copy string for bottom box);
    move up; move left enough to center top box;
    draw top box (i.e., copy string for top box);
    move down and left; draw line full width;
    return to proper base line.
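

The size bookkeeping in this translation amounts to three small formulas. The sketch below is Python and is not the EQN source: the Box type, the function name over, and the single pad parameter standing in for every "slightly more than" are hypothetical, and the string of motion commands is left out.

```python
from dataclasses import dataclass


@dataclass
class Box:
    width: float
    height: float
    base: float    # how far the baseline sits above the bottom of the box


def over(top, bottom, pad=0.1):
    """Dimensions of the box built for 'top OVER bottom'.

    pad is an assumed stand-in for 'slightly more than'; the paper gives no figures.
    """
    return Box(
        width=max(top.width, bottom.width) + pad,
        height=top.height + bottom.height + pad,
        base=bottom.height + pad,
    )


numerator = Box(width=3.0, height=1.0, base=0.0)
denominator = Box(width=5.0, height=1.0, base=0.0)
print(over(numerator, denominator, pad=0.5))
```

Because the result is itself a Box, it can serve as the numerator or denominator of a further fraction, which is how nested constructions come out right.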


Most of the other productions have equally sim- 
ple semantic actions. Picturing the output as a 
set of properly placed boxes makes the right 
sequence of positioning commands quite obvi- 
ous. The main difficulty is in finding the right 
numbers to use for esthetically pleasing position- 
ing. 

With a grammar, it is usually clear how to 
extend the language. For instance, one of our 
users suggested a TENSOR operator, to make 
constructions like 


! a 
T 
nm ni 
Grammatically, this is easy: it is sufficient to add 
a production like 


box : TENSOR { list }


Semantically, we need only juggle the boxes to 
the right places. 


6. Experience 


There are really three aspects of
interest: how well EQN sets mathematics, how
well it satisfies its goal of being "easy to use,"
and how easy it was to build.


The first question is easily addressed. This 
entire paper has been set by the program. 
Readers can judge for themselves whether it is 
good enough for their purposes. One of our 
users commented that although the output is not 
as good as the best hand-set material, it is still
better than average, and much better than the 
worst. In any case, who cares? Printed books 
cannot compete with the birds and flowers of 
illuminated manuscripts on esthetic grounds, 
either, but they have some clear economic 
advantages. 


Some of the deficiencies in the output 
could be cleaned up with more work on our part. 
For example, we sometimes leave too much 
space between a roman letter and an italic one. 
If we were willing to keep track of the fonts 
involved, we could do this better more of the
time.


Some other weaknesses are inherent in our 
output device. It is hard, for instance, to draw a 
line of an arbitrary length without getting a per- 
ceptible overstrike at one end. 


As to ease of use, at the time of writing, 
the system has been used by two distinct groups. 
One user population consists of mathematicians, 
chemists, physicists, and computer scientists.
Their typical reaction has been something like: 


(1) It’s easy to write, although I make the fol- 
lowing mistakes... 


(2) How do I do...? 


(3) It botches the following things.... Why 
don’t you fix them? 


(4) You really need the following features... 


The learning time is short. A few minutes 
gives the general flavor, and typing a page or two 
of a paper generally uncovers most of the 
misconceptions about how it works. 


The second user group is much larger, the 
secretaries and mathematical typists who were 
the original target of the system. They tend to 
be enthusiastic converts. They find the language 
easy to learn (most are largely self-taught), and 
have little trouble producing the output they 
want. They are of course less critical of the 
esthetics of their output than users trained in 
mathematics. After a transition period, most 
find using a computer more interesting than a 
regular typewriter. 


The main difficulty that users have seems 
to be remembering that a blank is a delimiter; 
even experienced users use blanks where they 
shouldn't and omit them when they are needed. 
A common instance is typing 


f(x sub i) 
which produces 


$f(x_{i)}$
instead of


$f(x_i)$


Since the EQN language knows no mathematics, 
it cannot deduce that the right parenthesis is not 
part of the subscript. 


The language is somewhat prolix, but this 
doesn’t seem excessive considering how much is 
being done, and it is certainly more compact than 
the corresponding TROFF commands. For exam- 
ple, here is the source for the continued fraction 
expression in Section 1 of this paper:


a sub 0 + b sub 1 over
 {a sub 1 + b sub 2 over
  {a sub 2 + b sub 3 over
   {a sub 3 + ... }}}


This is the input for the large integral of Section 
1; notice the use of definitions: 


define emx "{e sup mx}"
define mab "{m sqrt ab}"
define sa "{sqrt a}"
define sb "{sqrt b}"
int dx over {a emx - be sup -mx} ~=~
left { lpile {
   1 over {2 mab} ~log~
      {sa emx - sb} over {sa emx + sb}
   above
   1 over mab ~ tanh sup -1 ( sa over sb emx )
   above
   -1 over mab ~ coth sup -1 ( sa over sb emx )
}


As to ease of construction, we have 
already mentioned that there are really only a 
few person-months invested. Much of this time 
has gone into two things—fine-tuning (what is 
the most esthetically pleasing space to use 
between the numerator and denominator of a 
fraction?), and changing things found deficient 
by our users (shouldn't a tilde be a delimiter?). 


The program consists of a number of 
small, essentially unconnected modules for code 
generation, a simple lexical analyzer, a canned 
parser which we did not have to write, and some 
miscellany associated with input files and the 
macro facility. The program is now about 1600 
lines of C [6], a high-level language reminiscent 
of BCPL. About 20 percent of these lines are 
‘‘print’’ statements, generating the output code.


The semantic routines that generate the 
actual TROFF commands can be changed to 
accommodate other formatting languages and 
devices. For example, in less than 24 hours, one 
of us changed the entire semantic package to 
drive NROFF, a variant of TROFF, for typesetting 
mathematics on teletypewriter devices capable of 
reverse line motions. Since many potential users 
do not have access to a typesetter, but still have 
to type mathematics, this provides a way to get a 
typed version of the final output which is close 
enough for debugging purposes, and sometimes 
even for ultimate use. 


7. Conclusions 


We think we have shown that it is possible 
to do acceptably good typesetting of mathematics 
on a phototypesetter, with an input language that 
is easy to learn and use and that satisfies many 
users’ demands. Such a package can be imple- 
mented in short order, given a compiler-compiler 
and a decent typesetting program underneath.


Defining a language, and building a compiler
for it with a compiler-compiler seems like
the only sensible way to do business. Our
experience with the use of a grammar and a
compiler-compiler has been uniformly favorable.
If we had written everything into code directly,
we would have been locked into our original
design. Furthermore, we would have never been
sure where the exceptions and special cases were.
But because we have a grammar, we can change
our minds readily and still be reasonably sure
that if a construction works in one place it will
work everywhere.
1. Introduction


EQN is a program for typesetting 
mathematics on the Graphics Systems pho- 
totypesetters on UNIX and GCOS. The EQN 
language was designed to be easy to use by 
people who know neither mathematics nor 
typesetting. Thus EQN knows relatively little 
about mathematics. In particular, 
mathematical symbols like +, −, ×,
parentheses, and so on have no special 
meanings. EQN is quite happy to set garbage 
(but it will look good). 


EQN works as a preprocessor for the 
typesetter formatter, TROFF[1], so the normal
mode of operation is to prepare a document
with both mathematics and ordinary
text interspersed, and let EQN set the 
mathematics while TROFF does the body of 
the text. 


On UNIX, EQN will also produce
mathematics on DASI and GSI terminals and
on Model 37 teletypes. The input is identical,
but you have to use the programs NEQN
and NROFF instead of EQN and TROFF. Of 
course, some things won't look as good 
because terminals don't provide the variety 
of characters, sizes and fonts that a 
typesetter does, but the output is usually 
adequate for proofreading. 


To use EQN on UNIX, 
eqn files | troff 


GCOS use is discussed in section 26. 


2. Displayed Equations 


To tell EQN where a mathematical 
expression begins and ends, we mark it with 
lines beginning .EQ and .EN. Thus if you
type the lines


.EQ
x=y+z
.EN


your output will look like
\(x=y+z\)


The .EQ and .EN are copied through
untouched; they are not otherwise processed
by EQN. This means that you have to take
care of things like centering, numbering,
and so on yourself. The most common way
is to use the TROFF and NROFF macro
package ‘-ms’ developed by M. E. Lesk[3],
which allows you to center, indent,
left-justify and number equations.


With the ‘-ms’ package, equations are
centered by default. To left-justify an equation,
use .EQ L instead of .EQ. To indent it,
use .EQ I. Any of these can be followed by
an arbitrary ‘equation number’ which will be 
placed at the right margin. For example, 
the input 


.EQ I (3.1a)
x = f(y/2) + y/2
.EN


produces the output
\(x=f(y/2)+y/2\)    (3.1a)


There is also a shorthand notation so 
in-line expressions like \(\pi_i^2\) can be entered
without .EQ and .EN. We will talk about it in
section 19. 


3. Input spaces 


Spaces and newlines within an expres- 
sion are thrown away by EQN. (Normal text 
is left absolutely alone.) Thus between .EQ
and .EN,


x=y+z
and
x = y + z
and
x   =   y
   + z


and so on all produce the same output
\(x=y+z\)


You should use spaces and newlines freely 
to make your input equations readable and 
easy to edit. In particular, very long lines 
are a bad idea, since they are often hard to 
fix if you make a mistake. 


4. Output spaces 


To force extra spaces into the output,
use a tilde ‘‘~’’ for each space you want:


x~=~y~+~z
gives
\(x\ =\ y\ +\ z\)


You can also use a circumflex ‘‘^’’, which
gives a space half the width of a tilde. It is 
mainly useful for fine-tuning. Tabs may 
also be used to position pieces of an expres- 
sion, but the tab stops must be set by TROFF 
commands. 
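
As a small illustration (the expression itself is arbitrary), tildes and circumflexes can be mixed in one equation; here the plus sign gets full spaces on each side and the equals sign gets half spaces:

```troff
.EQ
a~+~b^=^c
.EN
```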


5. Symbols, Special Names, Greek 


EQN knows some mathematical sym- 
bols, some mathematical names, and the 
Greek alphabet. For example, 


x=2 pi int sin ( omega t)dt
produces
\(x=2\pi\int\sin(\omega t)\,dt\)


Here the spaces in the input are necessary 
to tell EQN that int, pi, sin and omega are 
separate entities that should get special 
treatment. The sin, digit 2, and parentheses 
are set in roman type instead of italic; pi and 
omega are made Greek; and int becomes the 
integral sign. 


When in doubt, leave spaces around 
separate parts of the input. A very common 
error is to type f(pi) without leaving spaces
on both sides of the pi. As a result, EQN
does not recognize pi as a special word, and
it appears as \(f(pi)\) instead of \(f(\pi)\).
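
A sketch of the repaired input: with a blank on each side, pi stands alone and is recognized as the Greek letter.

```troff
.EQ
f( pi )
.EN
```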


A complete list of EQN names appears 
in section 23. Knowledgeable users can also 
use TROFF four-character names for anything
EQN doesn’t know about, like \(bs for
the Bell System sign.


6. Spaces, Again 

The only way EQN can deduce that 
some sequence of letters might be special is 
if that sequence is separated from the letters 
on either side of it. This can be done by 
surrounding a special word by ordinary 
spaces (or tabs or newlines), as we did in 
the previous section. 


You can also make special words stand 
out by surrounding them with tildes or 
circumflexes: 


x~=~2~pi~int~sin~(~omega~t~)~dt
is much the same as the last example,
except that the tildes not only separate the
magic words like sin, omega, and so on, but
also add extra spaces, one space per tilde:


\(x\ =\ 2\ \pi\ \int\ \sin\ (\ \omega\ t\ )\ dt\)


Special words can also be separated by 
braces { } and double quotes "...", which
have special meanings that we will see soon. 


7. Subscripts and Superscripts 
Subscripts and superscripts are
obtained with the words sub and sup.
x sup 2 + y sub k
gives
\(x^2+y_k\)


EQN takes care of all the size changes and 
vertical motions needed to make the output 
look right. The words sub and sup must be 
surrounded by spaces; x sub2 will give you
\(xsub2\) instead of \(x_2\). Furthermore, don’t
forget to leave a space (or a tilde, etc.) to
mark the end of a subscript or superscript.
A common error is to say something like


y = (x sup 2)+1
which causes
\(y=(x^{2)+1}\)
instead of the intended


\(y=(x^2)+1\)
Subscripted subscripts and superscripted
superscripts also work:


x sub i sub 1
is
\(x_{i_1}\)


A subscript and superscript on the same
thing are printed one above the other if the
subscript comes first:


x sub i sup 2
is
\(x_i^2\)


Other than this special case, sub and
sup group to the right, so x sup y sub z
means \(x^{y_z}\), not \({x^y}_z\).
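
To make the grouping explicit, the two readings can be written with braces. This is only a sketch: the first equation spells out the default right grouping, the second forces the other one.

```troff
.EQ
x sup {y sub z}
.EN
.EQ
{x sup y} sub z
.EN
```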


8. Braces for Grouping 


Normally, the end of a subscript or 
superscript is marked simply by a blank (or 
tab or tilde, etc.) What if the subscript or 
superscript is something that has to be typed 
with blanks in it? In that case, you can use 
the braces { and } to mark the beginning and 
end of the subscript or superscript: 


e sup {i omega t}
gives
\(e^{i\omega t}\)


Rule: Braces can always be used to force 
EQN to treat something as a unit, or just to 
make your intent perfectly clear. Thus: 


x sub {i sub 1} sup 2
is
\(x_{i_1}^2\)
with braces, but


x sub i sub 1 sup 2
is
\(x_{i_1^2}\)
which is rather different.


Braces can occur within braces if 
necessary: 


e sup {i pi sup {rho +1}}
is
\(e^{i\pi^{\rho+1}}\)


The general rule is that anywhere you could
use some single thing like x, you can use an 
arbitrarily complicated thing if you enclose it 
in braces. EQN will look after all the details 
of positioning it and making it the right size. 


In all cases, make sure you have the 
right number of braces. Leaving one out or 
adding an extra will cause EQN to complain 
bitterly. 

Occasionally you will have to print 
braces. To do this, enclose them in double 
quotes, like "{". Quoting is discussed in
more detail in section 14. 


9. Fractions 
To make a fraction, use the word over: 
a+b over 2c =1
gives
\(a+\frac{b}{2c}=1\)


The line is made the right length and posi- 
tioned automatically. Braces can be used to 
make clear what goes over what: 


{alpha + beta} over {sin (x)}
is
\(\frac{\alpha+\beta}{\sin(x)}\)
What happens when there is both an over 
and a sup in the same expression? In such 
an apparently ambiguous case, EQN does the 
sup before the over, so 


-b sup 2 over pi
is \(\frac{-b^2}{\pi}\) instead of \(-b^{\frac{2}{\pi}}\). The rules which
decide which operation is done first in cases
like this are summarized in section 23.
When in doubt, however, use braces to 
make clear what goes with what. 
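
As a sketch of that advice, each reading of the ambiguous input can be forced with braces. The first equation puts all of −b² over π; the second makes the fraction 2/π the exponent.

```troff
.EQ
{-b sup 2} over pi
.EN
.EQ
-b sup {2 over pi}
.EN
```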


10. Square Roots 
To draw a square root, use sqrt: 


sqrt a+b + 1 over sqrt {ax sup 2 +bx+c}
is
\(\sqrt{a+b}+\frac{1}{\sqrt{ax^2+bx+c}}\)
Warning: square roots of tall quantities
look lousy, because a root-sign big enough
to cover the quantity is too dark and heavy:


sqrt {a sup 2 over b sub 2}
is
\(\sqrt{\frac{a^2}{b_2}}\)
Big square roots are generally better written
as something to the power \(\frac{1}{2}\):


\((a^2/b_2)^{\frac{1}{2}}\)
which is
(a sup 2 /b sub 2 ) sup half


11. Summation, Integral, Etc. 
Summations, integrals, and similar 
constructions are easy: 
sum from i=0 to {i= inf} x sup i 


produces 


\(\sum_{i=0}^{i=\infty} x^i\)


Notice that we used braces to indicate where
the upper part \(i=\infty\) begins and ends. No
braces were necessary for the lower part
\(i=0\), because it contained no blanks. The
braces will never hurt, and if the from and to
parts contain any blanks, you must use 
braces around them. 


The from and to parts are both 
optional, but if both are used, they have to 
occur in that order. 
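
Because each part is optional, either can appear alone. The two equations below are illustrative only: the first gives an integral a lower limit and no upper one, the second gives a sum an upper limit and no lower one.

```troff
.EQ
int from 0 f(x) dx
.EN
.EQ
sum to n a sub i
.EN
```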


Other useful characters can replace the 
sum in our example: 


int prod union inter 
become, respectively, 


\(\int\quad\prod\quad\cup\quad\cap\)


Since the thing before the from can be anything,
even something in braces, from-to can
often be used in unexpected ways:


lim from {n -> inf} x sub n =0
is


\(\lim_{n\to\infty} x_n=0\)


12. Size and Font Changes 


By default, equations are set in 10- 
point type (the same size as this guide), 
with standard mathematical conventions to 
determine what characters are in roman and 
what in italic. Although EQN makes a vali- 
ant attempt to use esthetically pleasing sizes 
and fonts, it is not perfect. To change sizes 
and fonts, use size n and roman, italic, bold 
and fat. Like sub and sup, size and font 
changes affect only the thing that follows 
them, and revert to the normal situation at 
the end of it. Thus 


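bold x y


is


\(\mathbf{x}y\)
and
size 14 bold x = y +
size 14 {alpha + beta}
gives


\(\mathbf{x}=y+\alpha+\beta\) (with the bold x and the \(\alpha+\beta\) set in 14-point type)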


As always, you can use braces if you want to 
affect something more complicated than a 
single letter. For example, you can change 
the size of an entire equation by 


size 12 {... } 


Legal sizes which may follow size are 
6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 
28, 36. You can also change the size by a 
given amount; for example, you can say 
size +2 to make the size two points bigger, 
or size -3 to make it three points smaller.
This has the advantage that you don’t have 
to know what the current size is. 
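
A brief sketch of relative sizes, using an arbitrary expression: the braced term is set two points larger than the surrounding equation and the y three points smaller, whatever the current size happens to be.

```troff
.EQ
size +2 {x sup 2} + size -3 y
.EN
```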


If you are using fonts other than 
roman, italic and bold, you can say font X 
where X is a one character TROFF name or 
number for the font. Since EQN is tuned for 
roman, italic and bold, other fonts may not 
give quite as good an appearance. 


The fat operation takes the current
font and widens it by overstriking: fat grad is
a widened \(\nabla\) and fat {x sub i} is a widened \(x_i\).


If an entire document is to be in a 
non-standard size or font, it is a severe nui- 
sance to have to write out a size and font 
change for each equation. Accordingly, you 
can set a ‘“‘global’’ size or font which 
thereafter affects all equations. At the
beginning of any equation, you might say, 
for instance, 


.EQ 
gsize 16 
gfont R 
.EN


to set the size to 16 and the font to roman 
thereafter. In place of R, you can use any 
of the TROFF font names. The size after 
gsize can be a relative change with + or —. 


Generally, gsize and gfont will appear at
the beginning of a document but they can
also appear throughout a document: the global
font and size can be changed as often as
needed. For example, in a footnote† you
will typically want the size of equations to 
match the size of the footnote text, which is 
two points smaller than the main text. 
Don’t forget to reset the global size at the 
end of the footnote. 
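
A footnote might be bracketed along the following lines. This is only a sketch under two assumptions that are not part of the text above: that the footnote is written with the ‘-ms’ macros .FS and .FE, and that dollar signs have already been set as in-line delimiters (section 19).

```troff
.FS
.EQ
gsize -2
.EN
Here $x sub i$ is set to match the footnote text.
.EQ
gsize +2
.EN
.FE
```

The second gsize undoes the first, so equations after the footnote return to the main text size.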


13. Diacritical Marks 


To get funny marks on top of letters, 
there are several words: 


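x dot        \(\dot{x}\)
x dotdot     \(\ddot{x}\)
x hat        \(\hat{x}\)
x tilde      \(\tilde{x}\)
x vec        \(\vec{x}\)
x dyad       \(\overleftrightarrow{x}\)
x bar        \(\bar{x}\)
x under      \(\underline{x}\)


The diacritical mark is placed at the right
height. The bar and under are made the
right length for the entire construct, as in
\(\overline{x+y+z}\); other marks are centered.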


14. Quoted Text


Any input entirely within quotes 
("...") is not subject to any of the font 
changes and spacing adjustments normally 
done by the equation setter. This provides a 
way to do your own spacing and adjusting if 
needed: 


†Like this one, in which we have a few random
expressions like \(x_i\) and \(\pi^2\). The sizes for these
were set by the command gsize -2.


italic "sin(x)" + sin (x)
is
\(\mathit{sin(x)}+\sin(x)\)


Quotes are also used to get braces and 
other EQN keywords printed: 


"{ size alpha }"
is
{ size alpha }
and
roman "{ size alpha }"
is
{ size alpha }


The construction "" is often used as a
place-holder when grammatically EQN needs
something, but you don’t actually want anything
in your output. For example, to make
\({}^2\mathrm{He}\), you can’t just type sup 2 roman He
because a sup has to be a superscript on
something. Thus you must say


"" sup 2 roman He


To get a literal quote use ‘‘\"’’. TROFF
characters like \(bs can appear unquoted,
but more complicated things like horizontal
and vertical motions with \h and \v should
always be quoted. (If you’ve never heard of
\h and \v, ignore this section.)


15. Lining Up Equations 


Sometimes it’s necessary to line up a 
series of equations at some horizontal posi- 
tion, often at an equals sign. This is done 
with two operations called mark and lineup. 


The word mark may appear once at 
any place in an equation. It remembers the 
horizontal position where it appeared. Suc- 
cessive equations can contain one 
occurrence of the word lineup. The place 
where lineup appears is made to line up with
the place marked by the previous mark if at 
all possible. Thus, for example, you can say 
.EQ I
x+y mark = z
.EN
.EQ I
x lineup = 1
.EN


to produce
\(\begin{aligned} x+y&=z\\ x&=1 \end{aligned}\)
For reasons too complicated to talk about,
when you use EQN and ‘-ms’, use either
.EQ I or .EQ L. mark and lineup don’t work
with centered equations. Also bear in mind
that mark doesn’t look ahead;


x mark =1


x+y lineup =z
isn’t going to work, because there isn’t
room for the x+y part after the mark
remembers where the x is.


16. Big Brackets, Etc. 


To get big brackets [ ], braces { },
parentheses ( ), and bars | | around things,
use the left and right commands:


left { a over b + 1 right }
~=~ left ( c over d right )
+ left [ e right ]


is
\(\left\{\frac{a}{b}+1\right\}\ =\ \left(\frac{c}{d}\right)+\left[e\right]\)


The resulting brackets are made big enough 
to cover whatever they enclose. Other characters
can be used besides these, but they are
not likely to look very good. One exception
is the floor and ceiling characters:


left floor x over y right floor
<= left ceiling a over b right ceiling


produces


\(\left\lfloor\frac{x}{y}\right\rfloor\le\left\lceil\frac{a}{b}\right\rceil\)
Several warnings about brackets are in 
order. First, braces are typically bigger than 
brackets and parentheses, because they are 
made up of three, five, seven, etc., pieces, 
while brackets can be made up of two,
three, etc. Second, big left and right
parentheses often look poor, because the 
character set is poorly designed. 


The right part may be omitted: a ‘‘left
something’’ need not have a corresponding
‘‘right something’’. If the right part is omitted,
put braces around the thing you want
the left bracket to encompass. Otherwise, 
the resulting brackets may be too large. 


If you want to omit the left part, things
are more complicated, because technically 
you can’t have a right without a correspond- 
ing left. Instead you have to say 


left "" ..... right ) 


for example. The left "" means a ‘‘left nothing’’.
This satisfies the rules without hurting
your output.
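
Filled in with an arbitrary fraction, the idiom might look like the sketch below, which draws a big right bracket after a/b and nothing on the left.

```troff
.EQ
left "" {a over b} right ]
.EN
```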


17. Piles 


There is a general facility for making 
vertical piles of things; it comes in several 
flavors. For example: 


A ~=~ left [
  pile { a above b above c }
  ~~ pile { x above y above z }
right ]


will make


\(A\ =\ \left[\begin{matrix}a\\b\\c\end{matrix}\ \ \begin{matrix}x\\y\\z\end{matrix}\right]\)


The elements of the pile (there can be as
many as you want) are centered one above
another, at the right height for most purposes.
The keyword above is used to
separate the pieces; braces are used around
the entire list. The elements of a pile can
be as complicated as needed, even containing
more piles.


Three other forms of pile exist: lpile
makes a pile with the elements left-justified;
rpile makes a right-justified pile; and cpile
makes a centered pile, just like pile. The
vertical spacing between the pieces is somewhat
larger for l-, r- and cpiles than it is for
ordinary piles.


roman sign (x)~=~
left {
   lpile {1 above 0 above -1}
   ~~ lpile
    {if~x>0 above if~x=0 above if~x<0}


makes


\(\mathrm{sign}(x)\ =\ \left\{\begin{array}{ll}1&\mathrm{if}\ x>0\\0&\mathrm{if}\ x=0\\-1&\mathrm{if}\ x<0\end{array}\right.\)

Notice the left brace without a matching 
right one. 


18. Matrices 


It is also possible to make matrices. 
For example, to make a neat array like 
\(\begin{matrix}x_i&x^2\\y_i&y^2\end{matrix}\)


you have to type


matrix {
  ccol { x sub i above y sub i }
  ccol { x sup 2 above y sup 2 }
}


This produces a matrix with two centered
columns. The elements of the columns are
then listed just as for a pile, each element
separated by the word above. You can also
use lcol or rcol to left or right adjust
columns. Each column can be separately 
adjusted, and there can be as many columns 
as you like. 


The reason for using a matrix instead 
of two adjacent piles, by the way, is that if 
the elements of the piles don’t all have the 
same height, they won’t line up properly. A 
matrix forces them to line up, because it 
looks at the entire structure before deciding 
what spacing to use. 


A word of warning about matrices — 
each column must have the same number of 
elements in it. The world will end if you get 
this wrong. 
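
As an illustration with made-up entries, the matrix below mixes a left-adjusted and a right-adjusted column. Each column lists exactly three elements, as the warning requires.

```troff
.EQ
matrix {
  lcol { 1 above 10 above 100 }
  rcol { x above y sub 1 above z }
}
.EN
```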


19. Shorthand for In-line Equations 


In a mathematical document, it is 
necessary to follow mathematical conven- 
tions not just in display equations, but also 
in the body of the text, for example by mak- 
ing variable names like x italic. Although 
this could be done by surrounding the 
appropriate parts with .EQ and .EN, the continual
repetition of .EQ and .EN is a nuisance.
Furthermore, with ‘-ms’, .EQ and .EN imply
a displayed equation. 


EQN provides a shorthand for short in- 
line expressions. You can define two char- 
acters to mark the left and right ends of an 
in-line equation, and then type expressions 
right in the middle of text lines. To set 
both the left and right characters to dollar 
signs, for example, add to the beginning of 
your document the three lines 


.EQ 
delim $$
.EN 


Having done this, you can then say things 
like 


Let $alpha sub i$ be the primary
variable, and let $beta$ be zero.
Then we can show that $x sub 1$ is
$>=0$.


This works as you might expect — spaces, 
newlines, and so on are significant in the 
text, but not in the equation part itself. 
Multiple equations can occur in a single 
input line. 

Enough room is left before and after a
line that contains in-line expressions that
something like \(\sum_{i=1}^{n} x_i\) does not interfere with
the lines surrounding it.

To turn off the delimiters, 


.EQ 
delim off 
.EN 


Warning: don’t use braces, tildes, 
circumflexes, or double quotes as delimiters 
— chaos will result. 


20. Definitions 


EQN provides a facility so you can give 
a frequently-used string of characters a 
name, and thereafter just type the name 
instead of the whole string. For example, if 
the sequence 


x sub i sub 1 + y sub i sub 1


appears repeatedly throughout a paper, you 
can save re-typing it each time by defining it 
like this: 


define xy 'x sub i sub 1 + y sub i sub 1'


This makes xy a shorthand for whatever
characters occur between the single quotes
in the definition. You can use any character
instead of quote to mark the ends of the
definition, so long as it doesn’t appear inside 
the definition. 


Now you can use xy like this:


.EQ 
f(x) = xy... 
.EN 


and so on. Each occurrence of xy will
expand into what it was defined as. Be careful
to leave spaces or their equivalent
around the name when you actually use it,
so EQN will be able to identify it as special.


There are several things to watch out 
for. First, although definitions can use pre- 
vious definitions, as in 


.EQ 

define xi ' x sub i '
define xi1 ' xi sub 1 '
.EN


don’t define something in terms of itself. A
favorite error is to say


define X ' roman X '


This is a guaranteed disaster, since X is now
defined in terms of itself. If you say


define X ' roman "X" '


however, the quotes protect the second X, 
and everything works fine. 
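
Putting the pieces together, a document might set up its definitions once and then use them. In this sketch the names xi and sumxi are hypothetical; the second definition builds on the first, and neither refers to itself.

```troff
.EQ
define xi ' x sub i '
define sumxi ' sum from i=1 to n xi '
.EN
.EQ
sumxi ~=~ 0
.EN
```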


EQN keywords can be redefined. You 
can make / mean over by saying 


define / ' over '
or redefine over as / with


define over ' / '


If you need different things to print on 
a terminal and on the typesetter, it is some- 
times worth defining a symbol differently in 
NEQN and EQN. This can be done with 
ndefine and tdefine. A definition made with 
ndefine only takes effect if you are running 
NEQN; if you use tdefine, the definition only
applies for EQN. Names defined with plain 
define apply to both EQN and NEQN. 
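
The sketch below shows the pattern with a hypothetical name, approxeq. It assumes, purely for illustration, that you want the typeset symbol from EQN but a plain quoted string on a terminal.

```troff
.EQ
tdefine approxeq ' approx '
ndefine approxeq ' "~=" '
.EN
.EQ
x approxeq y
.EN
```

The same source file then runs unchanged through either EQN or NEQN.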


21. Local Motions 


Although EQN tries to get most things 
at the right place on the paper, it isn’t per- 
fect, and occasionally you will need to tune 
the output to make it just right. Small extra
horizontal spaces can be obtained with tilde
and circumflex. You can also say back n and
fwd n to move small amounts horizontally.
n is how far to move in 1/100’s of an em
(an em is about the width of the letter ‘m’.)
Thus back 50 moves back about half the
width of an m. Similarly you can move
things up or down with up n and down n. As
with sub or sup, the local motions affect the 
next thing in the input, and this can be 
something arbitrarily complicated if it is 
enclosed in braces. 
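
For instance (the amounts here are arbitrary, chosen only to show the syntax), the braced f(x) below is pulled back toward the integral sign and the subscript is nudged down:

```troff
.EQ
int back 30 {f(x)} dx ~+~ x sub {down 10 i}
.EN
```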


22. A Large Example 


Here is the complete source for the 
three display equations in the abstract of this 
guide. 


.EQ I
G(z)~mark =~ e sup { ln ~ G(z) }
~=~ exp left (
sum from k>=1 {S sub k z sup k} over k right )
~=~ prod from k>=1 e sup {S sub k z sup k /k}
.EN
.EQ I
lineup = left ( 1 + S sub 1 z +
{ S sub 1 sup 2 z sup 2 } over 2! + ... right )
left ( 1+ { S sub 2 z sup 2 } over 2
+ { S sub 2 sup 2 z sup 4 } over { 2 sup 2 cdot 2! }
+ ... right ) ...
.EN
.EQ I
lineup = sum from m>=0 left (
sum from
pile { k sub 1 ,k sub 2 ,..., k sub m >=0
above
k sub 1 +2k sub 2 + ... +mk sub m =m}
{ S sub 1 sup {k sub 1} } over {1 sup k sub 1 k sub 1 ! } ~
{ S sub 2 sup {k sub 2} } over {2 sup k sub 2 k sub 2 ! } ~
...
{ S sub m sup {k sub m} } over {m sup k sub m k sub m ! }
right ) z sup m
.EN


23. Keywords, Precedences, Etc. 
If you don’t use braces, EQN will do
operations in the order shown in this list.


dyad vec under bar tilde hat dot dotdot
fwd back down up
fat roman italic bold size
sub sup sqrt over
from to


These operations group to the left: 
over sqrt left right 
All others group to the right. 
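
A few unbraced inputs show these rules at work. This is a sketch: the first equation sets a² over b because sup is done before over; the second sets (a/b) over c because over groups to the left; the third uses braces to force a over (b/c).

```troff
.EQ
a sup 2 over b
.EN
.EQ
a over b over c
.EN
.EQ
a over {b over c}
.EN
```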
Digits, parentheses, brackets, punctuation
marks, and these mathematical words
are converted to Roman font when encountered:


sin cos tan sinh cosh tanh arc
max min lim log ln exp
Re Im and if for det


These character sequences are recognized 
and translated as shown. 


>=         ≥
<=         ≤
==         ≡
!=         ≠
+-         ±
->         →
<-         ←
<<         ≪
>>         ≫
inf        ∞
partial    ∂
half       ½
prime      ′
approx     ≈
nothing
cdot       ·
times      ×
del        ∇
grad       ∇
...        ⋯
,...,      ,…,
sum        ∑
int        ∫
prod       ∏
union      ∪
inter      ∩

To obtain Greek letters, simply spell 
them out in whatever case you want: 


DELTA     Δ      iota      ι
GAMMA     Γ      kappa     κ
LAMBDA    Λ      lambda    λ
OMEGA     Ω      mu        μ
PHI       Φ      nu        ν
PI        Π      omega     ω
PSI       Ψ      omicron   ο
SIGMA     Σ      phi       φ
THETA     Θ      pi        π
UPSILON   Υ      psi       ψ
XI        Ξ      rho       ρ
alpha     α      sigma     σ
beta      β      tau       τ
chi       χ      theta     θ
delta     δ      upsilon   υ
epsilon   ε      xi        ξ
eta       η      zeta      ζ
gamma     γ


These are all the words known to EQN 
(except for characters with names), together 
with the section where they are discussed. 


above    17, 18     lpile    17
back     21         mark     15
bar      13         matrix   18
bold     12         ndefine  20
ccol     18         over     9
col      18         pile     17
cpile    17         rcol     18
define   20         right    16
delim    19         roman    12
dot      13         rpile    17
dotdot   13         size     12
down     21         sqrt     10
dyad     13         sub      7
fat      12         sup      7
font     12         tdefine  20
from     11         tilde    13
fwd      21         to       11
gfont    12         under    13
gsize    12         up       21
hat      13         vec      13
italic   12         ~, ^     4, 6
lcol     18         { }      8
left     16         "..."    8, 14
lineup   15


24. Troubleshooting 


If you make a mistake in an equation, 
like leaving out a brace (very common) or 
having one too many (very common) or 
having a sup with nothing before it (com- 
mon), EQN will tell you with the message 


syntax error between lines x and y, file z 


where x and y are approximately the lines 
between which the trouble occurred, and z is
the name of the file in question. The line 
numbers are approximate — look nearby as 
well. There are also self-explanatory messages
that arise if you leave out a quote or
try to run EQN on a non-existent file. 


If you want to check a document 
before actually printing it (on UNIX only), 
eqn files >/dev/null


will throw away the output but print the 
messages. 


If you use something like dollar signs 
as delimiters, it is easy to leave one out. 
This causes very strange troubles. The pro- 
gram checkeq (on GCOS, use ./checkeq 
instead) checks for misplaced or missing 
dollar signs and similar troubles. 


In-line equations can only be so big 
because of an internal buffer in TROFF. If 
you get a message ‘‘word overflow’’, you 
have exceeded this limit. If you print the 
equation as a displayed equation this mes- 
sage will usually go away. The message 
‘‘line overflow’ indicates you have 
exceeded an even bigger buffer. The only 
cure for this is to break the equation into 
two separate ones. 


On a related topic, EQN does not break 
equations by itself; you must split long
equations up across multiple lines by your- 
self, marking each by a separate .EQ ... .EN 
sequence. EQN does warn about equations 
that are too long to fit on one line. 


25. Use on UNIX 


To print a document that contains 
mathematics on the UNIX typesetter, 


eqn files | troff


If there are any TROFF options, they go after 
the TROFF part of the command. For exam- 
ple, 


eqn files | troff -ms


To run the same document on the GCOS 
typesetter, use 


eqn files | troff -g (other options) | gcat 


A compatible version of EQN can be 
used on devices like teletypes and DASI and 
GSI terminals which have half-line forward 
and reverse capabilities. To print equations 
on a Model 37 teletype, for example, use


neqn files | nroff 


The language for equations recognized by 
NEQN is identical to that of EQN, although of
course the output is more restricted. 


To use a GSI or DASI terminal as the
output device,


neqn files | nroff -Tx


where x is the terminal type you are using,
such as 300 or 300S.


EQN and NEQN can be used with the 
TBL program[2] for setting tables that contain
mathematics. Use TBL before [N]EQN,
like this:


tbl files | eqn | troff
tbl files | neqn | nroff
Tbl - A Program to Format Tables


M. E. Lesk


Bell Laboratories
Murray Hill, New Jersey 07974 


Introduction. 


Tbl turns a simple description of a table into a troff or nroff [1] program (list of commands)
that prints the table. Tbl may be used on the PDP-11 UNIX [2] system and on the
Honeywell 6000 GCOS system. It attempts to isolate a portion of a job that it can successfully
handle and leave the remainder for other programs. Thus tbl may be used with the equation
formatting program eqn [3] or various layout macro packages [4,5,6], but does not duplicate
their functions.


This memorandum is divided into two parts. First we give the rules for preparing tbl
input; then some examples are shown. The description of rules is precise but technical, and the
beginning user may prefer to read the examples first, as they show some common table
arrangements. A section explaining how to invoke tbl precedes the examples. To avoid repetition,
henceforth read troff as ‘‘troff or nroff.’’


The input to tbl is text for a document, with tables preceded by a ‘‘.TS’’ (table start)
command and followed by a ‘‘.TE’’ (table end) command. Tbl processes the tables, generating
troff formatting commands, and leaves the remainder of the text unchanged. The ‘‘.TS’’ and
‘‘.TE’’ lines are copied, too, so that troff page layout macros (such as the memo formatting
macros [4]) can use these lines to delimit and place tables as they see fit. In particular, any
arguments on the ‘‘.TS’’ or ‘‘.TE’’ lines are copied but otherwise ignored, and may be used by
document layout macro commands.


The format of the input is as follows: 


text 
.TS
table
.TE
text
.TS
table
.TE
text 


where the format of each table is as follows: 


.TS
options ;
format .
data
.TE


Each table is independent, and must contain formatting information followed by the data to be 
entered in the table. The formatting information, which describes the individual columns and 
rows of the table, may be preceded by a few options that affect the entire table. A detailed 
description of tables is given in the next section. 
Input commands.


As indicated above, a table contains, first, global options, then a format section describing
the layout of the table entries, and then the data to be printed. The format and data are always
required, but not the options. The various parts of the table are entered as follows:


1) OPTIONS. There may be a single line of options affecting the whole table. If present, this
line must follow the .TS line immediately and must contain a list of option names
separated by spaces, tabs, or commas, and must be terminated by a semicolon. The
allowable options are:


center        - center the table (default is left-adjust);
expand        - make the table as wide as the current line length;
box           - enclose the table in a box;
allbox        - enclose each item in the table in a box;
doublebox     - enclose the table in two boxes;
tab (x)       - use x instead of tab to separate data items;
linesize (n)  - set lines or rules (e.g. from box) in n point type;
delim (xy)    - recognize x and y as the eqn delimiters.


The tbl program tries to keep boxed tables on one page by issuing appropriate ‘‘need’’
(.ne) commands. These requests are calculated from the number of lines in the tables,
and if there are spacing commands embedded in the input, these requests may be inaccurate;
use normal troff procedures, such as keep-release macros, in that case. The user who
must have a multi-page boxed table should use macros designed for this purpose, as
explained below under ‘Usage.’


FORMAT. The format section of the table specifies the layout of the columns. Each line 
in this section corresponds to one line of the table (except that the last line corresponds to 
all following lines up to the next .T&, if any — see below), and each line contains a key- 
letter for each column of the table. It is good practice to separate the key letters for each 
column by spaces or tabs. Each key-letter is one of the following: 


L or l   to indicate a left-adjusted column entry;

R or r   to indicate a right-adjusted column entry;

C or c   to indicate a centered column entry;

N or n   to indicate a numerical column entry, to be aligned with other numerical
         entries so that the units digits of numbers line up;

A or a   to indicate an alphabetic subcolumn; all corresponding entries are aligned on
         the left, and positioned so that the widest is centered within the column (see
         example on page 12);

S or s   to indicate a spanned heading, i.e. to indicate that the entry from the previous
         column continues across this column (not allowed for the first column,
         obviously); or

^        to indicate a vertically spanned heading, i.e. to indicate that the entry from the
         previous row continues down through this row. (Not allowed for the first row
         of the table, obviously).


When numerical alignment is specified, a location for the decimal point is sought. The
rightmost dot (.) adjacent to a digit is used as a decimal point; if there is no dot adjoining
a digit, the rightmost digit is used as a units digit; if no alignment is indicated, the item is
centered in the column. However, the special non-printing character string \& may be
used to override unconditionally dots and digits, or to align alphabetic data; this string
lines up where a dot normally would, and then disappears from the final output. In the
example below, the items shown at the left will be aligned (in a numerical column) as
shown on the right:


13            13
4.2            4.2
26.4.12     26.4.12
abc           abc
abc\&        abc
43\&3.22      433.22
749.12       749.12
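
The alignment rule just stated can be sketched in Python. This is a simplified illustration that assumes fixed-width characters and ignores fonts and sizes; the names split_numeric and align_column are hypothetical and are not part of tbl.

```python
def split_numeric(entry):
    """Split an entry at its alignment point; None means 'center it'."""
    if "\\&" in entry:                         # explicit marker overrides all
        left, right = entry.split("\\&", 1)
        return left, right
    for i in range(len(entry) - 1, -1, -1):    # rightmost dot next to a digit
        if entry[i] == ".":
            before = i > 0 and entry[i - 1].isdigit()
            after = i + 1 < len(entry) and entry[i + 1].isdigit()
            if before or after:
                return entry[:i], entry[i:]
    for i in range(len(entry) - 1, -1, -1):    # otherwise the units digit
        if entry[i].isdigit():
            return entry[:i + 1], entry[i + 1:]
    return None                                # nothing to align on


def align_column(entries):
    """Return the entries padded so that their alignment points line up."""
    parts = [split_numeric(e) for e in entries]
    left_w = max((len(p[0]) for p in parts if p), default=0)
    right_w = max((len(p[1]) for p in parts if p), default=0)
    width = left_w + right_w
    out = []
    for entry, p in zip(entries, parts):
        if p is None:                          # centered within the column
            pad = max(0, (width - len(entry)) // 2)
            out.append(" " * pad + entry)
        else:
            out.append((p[0].rjust(left_w) + p[1]).rstrip())
    return out


items = ["13", "4.2", "26.4.12", "abc", "abc\\&", "43\\&3.22", "749.12"]
aligned = align_column(items)
```

Printing the lines of aligned one per line gives the same arrangement as the right-hand column of the example.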


Note: If numerical data are used in the same column with wider L or r type table entries, 
the widest number is centered relative to the wider L or r items (L is used instead of l for
readability; they have the same meaning as key-letters). Alignment within the numerical 
items is preserved. This is similar to the behavior of a type data, as explained above. 
However, alphabetic subcolumns (requested by the a key-letter) are always slightly 
indented relative to L items; if necessary, the column width is increased to force this. 
This is not true for n type entries. 


Warning: the n and a items should not be used in the same column. 


For readability, the key-letters describing each column should be separated by spaces.
The end of the format section is indicated by a period. The layout of the key-letters in
the format section resembles the layout of the actual data in the table. Thus a simple
format might appear as:

c s s
l n n .

which specifies a table of three columns. The first line of the table contains a heading
centered across all three columns; each remaining line contains a left-adjusted item in the
first column followed by two columns of numerical data. A sample table in this format
might be:


        Overall title
Item-a         34.22     9.1
Item-b         12.65      .02
Items: c,d,e   23        5.8
Total          69.87    14.92


There are some additional features of the key-letter system: 


Horizontal lines — A key-letter may be replaced by ‘_’ (underscore) to indicate a
horizontal line in place of the corresponding column entry, or by ‘=’ to indicate a
double horizontal line. If an adjacent column contains a horizontal line, or if there are
vertical lines adjoining this column, this horizontal line is extended to meet the 
nearby lines. If any data entry is provided for this column, it is ignored and a warn- 
ing message is printed. 

Vertical lines — A vertical bar may be placed between column key-letters. This will 
cause a vertical line between the corresponding columns of the table. A vertical bar 
to the left of the first key-letter or to the right of the last one produces a line at the 
edge of the table. If two vertical bars appear between key-letters, a double vertical 
line is drawn. 

Space between columns — A number may follow the key-letter. This indicates the
amount of separation between this column and the next column. The number
normally specifies the separation in ens (one en is about the width of the letter ‘n’).* If
the “expand” option is used, then these numbers are multiplied by a constant such
that the table is as wide as the current line length. The default column separation
number is 3. If the separation is changed the worst case (largest space requested)
governs.

* More precisely, an en is a number of points (1 point = 1/72 inch) equal to half the current type size.


Vertical spanning — Normally, vertically spanned items extending over several rows of 
the table are centered in their vertical range. If a key-letter is followed by t or T, 
any corresponding vertically spanned item will begin at the top line of its range. 


Font changes — A key-letter may be followed by a string containing a font name or
number preceded by the letter f or F. This indicates that the corresponding column
should be in a different font from the default font (usually Roman). All font names
are one or two letters; a one-letter font name should be separated from whatever
follows by a space or tab. The single letters B, b, I, and i are shorter synonyms for
fB and fI. Font change commands given with the table entries override these
specifications.


Point size changes — A key-letter may be followed by the letter p or P and a number to 
indicate the point size of the corresponding table entries. The number may be a 
signed digit, in which case it is taken as an increment or decrement from the current 
point size. If both a point size and a column separation value are given, one or 
more blanks must separate them. 


Vertical spacing changes — A key-letter may be followed by the letter v or V and a 
number to indicate the vertical line spacing to be used within a multi-line 
corresponding table entry. The number may be a signed digit, in which case it is 
taken as an increment or decrement from the current vertical spacing. A column 
separation value must be separated by blanks or some other specification from a 
vertical spacing request. This request has no effect unless the corresponding table 
entry is a text block (see below). 


Column width indication — A key-letter may be followed by the letter w or W and a width
value in parentheses. This width is used as a minimum column width. If the largest
element in the column is not as wide as the width value given after the w, the
largest element is assumed to be that wide. If the largest element in the column is
wider than the specified value, its width is used. The width is also used as a default
line length for included text blocks. Normal troff units can be used to scale the
width value; if none are used, the default is ens. If the width specification is a
unitless integer the parentheses may be omitted. If the width value is changed in a
column, the last one given controls.


Equal width columns — A key-letter may be followed by the letter e or E to indicate 
equal width columns. All columns whose key-letters are followed by e or E are 
made the same width. This permits the user to get a group of regularly spaced 
columns. 


Note: The order of the above features is immaterial; they need not be separated by 
spaces, except as indicated above to avoid ambiguities involving point size and font 
changes. Thus a numerical column entry in italic font and 12 point type with a 
minimum width of 2.5 inches and separated by 6 ens from the next column could be 
specified as 

np12w(2.5i)fI 6


Alternative notation — Instead of listing the format of successive lines of a table on con- 
secutive lines of the format section, successive line formats may be given on the 
same line, separated by commas, so that the format for the example above might 
have been written: 

c s s, l n n .


Default — Column descriptors missing from the end of a format line are assumed to be 
L. The longest line in the format section, however, defines the number of columns 
in the table; extra columns in the data are ignored silently. 
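
Since column separations and unitless widths are measured in ens, a small Python sketch may help in judging sizes. It relies only on the footnoted definition (an en is half the current type size, and 1 point = 1/72 inch); the function names are assumptions made for illustration.

```python
def en_in_points(point_size):
    """One en is half the current type size, measured in points."""
    return point_size / 2


def separation_in_inches(ens, point_size):
    """Width of a separation given in ens (1 point = 1/72 inch)."""
    return ens * en_in_points(point_size) / 72


default_gap = separation_in_inches(3, 10)   # default separation, 10-point type
wide_gap = separation_in_inches(6, 12)      # the "np12w(2.5i)fI 6" column
```

Under these definitions the default separation of 3 ens in 10-point type is 15 points, and the 6-en separation in the 12-point example is 36 points, or half an inch.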


3) DATA. The data for the table are typed after the format. Normally, each table line is
typed as one line of data. Very long input lines can be broken: any line whose last
character is \ is combined with the following line (and the \ vanishes). The data for different
columns (the table entries) are separated by tabs, or by whatever character has been
specified with the tab option. There are a few special cases:


Troff commands within tables — An input line beginning with a ‘.’ followed by anything
but a number is assumed to be a command to troff and is passed through unchanged,
retaining its position in the table. So, for example, space within a table may be
produced by “.sp” commands in the data.


Full width horizontal lines — An input line containing only the character _ (underscore)
or = (equal sign) is taken to be a single or double line, respectively, extending the
full width of the table.


Single column horizontal lines — An input table entry containing only the character _ or =
is taken to be a single or double line extending the full width of the column. Such
lines are extended to meet horizontal or vertical lines adjoining this column. To
obtain these characters explicitly in a column, either precede them by \& or follow
them by a space before the usual tab or newline.


Short horizontal lines — An input table entry containing only the string \_ is taken to be a
single line as wide as the contents of the column. It is not extended to meet
adjoining lines.


Repeated characters — An input table entry containing only a string of the form \Rx
where x is any character is replaced by repetitions of the character x as wide as the
data in the column. The sequence of x’s is not extended to meet adjoining
columns.


Vertically spanned items — An input table entry containing only the character string \^
indicates that the table entry immediately above spans downward over this row. It is
equivalent to a table format key-letter of ‘^’.


Text blocks — In order to include a block of text as a table entry, precede it by T{ and
follow it by T}. Thus the sequence


. . . T{
block of
text
T} . . .


is the way to enter, as a single entry in the table, something that cannot
conveniently be typed as a simple string between tabs. Note that the T} end delimiter
must begin a line; additional columns of data may follow after a tab on the same
line. See the example on page 10 for an illustration of included text blocks in a
table. If more than twenty or thirty text blocks are used in a table, various limits in
the troff program are likely to be exceeded, producing diagnostics such as ‘too many
string/macro names’ or ‘too many number registers.’


Text blocks are pulled out from the table, processed separately by troff, and replaced
in the table as a solid block. If no line length is specified in the block of text itself,
or in the table format, the default is to use L×C/(N+1) where L is the current line
length, C is the number of table columns spanned by the text, and N is the total
number of columns in the table. The other parameters (point size, font, etc.) used
in setting the block of text are those in effect at the beginning of the table (including
the effect of the “.TS” macro) and any table format specifications of size, spacing
and font, using the p, v and f modifiers to the column key-letters. Commands
within the text block itself are also recognized, of course. However, troff commands
within the table data but not within the text block do not affect that block.
Warnings: — Although any number of lines may be present in a table, only the first 200
lines are used in calculating the widths of the various columns. A multi-page table,
of course, may be arranged as several single-page tables if this proves to be a
problem. Other difficulties with formatting may arise because, in the calculation of
column widths all table entries are assumed to be in the font and size being used
when the “.TS” command was encountered, except for font and size changes
indicated (a) in the table format section and (b) within the table data (as in the entry
\s+3\fIdata\fP\s0). Therefore, although arbitrary troff requests may be sprinkled in
a table, care must be taken to avoid confusing the width calculations; use requests
such as ‘.ps’ with care.
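
Two of the data conventions in this section can be sketched in Python: joining lines that end in \ before splitting them into entries, and the default line length L×C/(N+1) for a text block. This is a minimal illustration under simplifying assumptions (it does not handle text blocks or troff commands in the data), and the function names are hypothetical.

```python
def join_continuations(lines):
    """Merge each line ending in a backslash with the line that follows it."""
    joined, pending = [], ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]         # the backslash itself vanishes
        else:
            joined.append(pending + line)
            pending = ""
    if pending:                          # a trailing continuation with no successor
        joined.append(pending)
    return joined


def split_entries(line, tab="\t"):
    """Split one data line into table entries on the separator character."""
    return line.split(tab)


def default_block_length(line_length, columns_spanned, total_columns):
    """Default text-block line length: L x C / (N + 1)."""
    return line_length * columns_spanned / (total_columns + 1)


rows = [split_entries(l) for l in join_continuations(["a\tb\\", "c\td", "e"])]
block = default_block_length(6.0, 1, 3)  # 6-inch line, one of three columns
```

With a 6-inch line length and a three-column table, a text block spanning one column defaults to 1.5 inches.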


4) ADDITIONAL COMMAND LINES. If the format of a table must be changed after many
similar lines, as with sub-headings or summarizations, the “.T&” (table continue) command
can be used to change column parameters. The outline of such a table input is:


.TS
options ;
format .
data
. . .
.T&
format .
data
.T&
format .
data
.TE


as in the examples on pages 10 and 12. Using this procedure, each table line can be close
to its corresponding format line.


Warning: it is not possible to change the number of columns, the space between columns, 
the global options such as box, or the selection of columns to be made equal width. 


Usage.
On UNIX, tbl can be run on a simple table with the command


tbl input-file | troff


but for more complicated use, where there are several input files, and they contain equations
and ms memorandum layout commands as well as tables, the normal command would be


tbl file-1 file-2 . . . | eqn | troff -ms


and, of course, the usual options may be used on the troff and eqn commands. The usage for
nroff is similar to that for troff, but only TELETYPE® Model 37 and Diablo-mechanism (DASI or
GSI) terminals can print boxed tables directly.


For the convenience of users employing line printers without adequate driving tables or
post-filters, there is a special -TX command line option to tbl which produces output that does
not have fractional line motions in it. The only other command line options recognized by tbl
are -ms and -mm, which are turned into commands to fetch the corresponding macro files;
usually it is more convenient to place these arguments on the troff part of the command line,
but they are accepted by tbl as well.


Note that when eqn and tbl are used together on the same file tbl should be used first. If
there are no equations within tables, either order works, but it is usually faster to run tbl first,
since eqn normally produces a larger expansion of the input than tbl. However, if there are
equations within tables (using the delim mechanism in eqn), tbl must be first or the output will
be scrambled. Users must also beware of using equations in n-style columns; this is nearly
always wrong, since tbl attempts to split numerical format items into two parts and this is not
possible with equations. The user can defend against this by giving the delim(xx) table option;
this prevents splitting of numerical columns within the delimiters. For example, if the eqn
delimiters are $$, giving delim($$) a numerical column such as “1245 $+- 16$” will be divided
after 1245, not after 16.


Tbl limits tables to twenty columns; however, use of more than 16 numerical columns
may fail because of limits in troff, producing the ‘too many number registers’ message. Troff
number registers used by tbl must be avoided by the user within tables; these include two-digit
names from 31 to 99, and names of the forms #x, x+, x|, ^x, and x-, where x is any lower
case letter. The names ##, #-, and #^ are also used in certain circumstances. To conserve
number register names, the n and a formats share a register, hence the restriction above that
they may not be used in the same column.
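
A macro writer might check register names against this list before using them inside a table. The Python sketch below is one way to do so; it assumes the register forms are #x, x+, x|, ^x and x- (x a lower case letter), and the function name is hypothetical.

```python
import string


def is_reserved_register(name):
    """True if a two-character number register name is one tbl may use."""
    if len(name) != 2:
        return False
    if all(ch in "0123456789" for ch in name):
        return 31 <= int(name) <= 99     # two-digit names from 31 to 99
    if name in ("##", "#-", "#^"):       # used in certain circumstances
        return True
    first, second = name
    lower = string.ascii_lowercase
    return (
        (first == "#" and second in lower)       # form #x
        or (first in lower and second in "+|-")  # forms x+, x| and x-
        or (first == "^" and second in lower)    # form ^x
    )


checks = {name: is_reserved_register(name) for name in ["31", "30", "#a", "x|", "TW"]}
```

The register TW is deliberately not reported as reserved here, since it is one the user is meant to read.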


For aid in writing layout macros, tbl defines a number register TW which is the table
width; it is defined by the time that the “.TE” macro is invoked and may be used in the
expansion of that macro. More importantly, to assist in laying out multi-page boxed tables the
macro T# is defined to produce the bottom lines and side lines of a boxed table, and then
invoked at its end. By use of this macro in the page footer a multi-page table can be boxed. In
particular, the ms macros can be used to print a multi-page boxed table with a repeated heading
by giving the argument H to the “.TS” macro. If the table start macro is written

.TS H

a line of the form

.TH

must be given in the table after any table heading (or at the start if none). Material up to the
“.TH” is placed at the top of each page of table; the remaining lines in the table are placed on
several pages as required. Note that this is not a feature of tbl, but of the ms layout macros.


Examples. 


Here are some examples illustrating features of tbl. The symbol @ in the input
represents a tab character.


Input:

.TS
box;
c c c
l l l.
Language@Authors@Runs on

Fortran@Many@Almost anything
PL/1@IBM@360/370
C@BTL@11/45,H6000,370
BLISS@Carnegie-Mellon@PDP-10,11
IDS@Honeywell@H6000
Pascal@Stanford@370
.TE

Output:

+----------------------------------------------+
| Language       Authors           Runs on     |
|                                              |
| Fortran    Many              Almost anything |
| PL/1       IBM               360/370         |
| C          BTL               11/45,H6000,370 |
| BLISS      Carnegie-Mellon   PDP-10,11       |
| IDS        Honeywell         H6000           |
| Pascal     Stanford          370             |
+----------------------------------------------+
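Input:

.TS
allbox;
c s s
c c c
n n n.
AT&T Common Stock
Year@Price@Dividend
1971@41-54@$2.60
2@41-54@2.70
3@46-55@2.87
4@40-53@3.24
5@45-52@3.40
6@51-59@.95*
.TE
* (first quarter only)

Output:

+-------------------------+
|    AT&T Common Stock    |
+------+-------+----------+
| Year | Price | Dividend |
+------+-------+----------+
| 1971 | 41-54 |  $2.60   |
+------+-------+----------+
|    2 | 41-54 |   2.70   |
+------+-------+----------+
|    3 | 46-55 |   2.87   |
+------+-------+----------+
|    4 | 40-53 |   3.24   |
+------+-------+----------+
|    5 | 45-52 |   3.40   |
+------+-------+----------+
|    6 | 51-59 |    .95*  |
+------+-------+----------+
* (first quarter only)


Input:

.TS
box;
c s s
c | c | c
l | l | n.
Major New York Bridges
=
Bridge@Designer@Length
_
Brooklyn@J. A. Roebling@1595
Manhattan@G. Lindenthal@1470
Williamsburg@L. L. Buck@1600
_
Queensborough@Palmer &@1182
@  Hornbostel
_
@@1380
Triborough@O. H. Ammann@_
@@383
_
Bronx Whitestone@O. H. Ammann@2300
Throgs Neck@O. H. Ammann@1800
_
George Washington@O. H. Ammann@3500
.TE

Output:

+---------------------------------------------+
|           Major New York Bridges            |
+=============================================+
|      Bridge       |    Designer    | Length |
+-------------------+----------------+--------+
| Brooklyn          | J. A. Roebling |   1595 |
| Manhattan         | G. Lindenthal  |   1470 |
| Williamsburg      | L. L. Buck     |   1600 |
+-------------------+----------------+--------+
| Queensborough     | Palmer &       |   1182 |
|                   |   Hornbostel   |        |
+-------------------+----------------+--------+
|                   |                |   1380 |
| Triborough        | O. H. Ammann   +--------+
|                   |                |    383 |
+-------------------+----------------+--------+
| Bronx Whitestone  | O. H. Ammann   |   2300 |
| Throgs Neck       | O. H. Ammann   |   1800 |
+-------------------+----------------+--------+
| George Washington | O. H. Ammann   |   3500 |
+-------------------+----------------+--------+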
Input:

.TS
c c
np-2 | n | .
@Stack
@_
1@46
@_
2@23
@_
3@15
@_
4@6.5
@_
5@2.1
@_
.TE

Output:

    Stack
   +------+
 1 | 46   |
   +------+
 2 | 23   |
   +------+
 3 | 15   |
   +------+
 4 |  6.5 |
   +------+
 5 |  2.1 |
   +------+
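Input:

.TS
box;
L L L
L L _
L L | LB
L L _
L L L.
january@february@march
april@may
june@july@Months
august@september
october@november@december
.TE

Output:

+--------------------------------+
| january   february    march    |
| april     may       +----------+
| june      july      | Months   |
| august    september +----------+
| october   november    december |
+--------------------------------+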
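Input:

.TS
box;
c s s s.
Composition of Foods
_
.T&
c | c s s
c | c s s
c | c | c | c.
Food@Percent by Weight
\^@_
\^@Protein@Fat@Carbo-
\^@\^@\^@hydrate
_
.T&
l | n | n | n.
Apples@.4@.5@13.0
Halibut@18.4@5.2@...
Lima beans@7.5@.8@22.0
Milk@3.3@4.0@5.0
Mushrooms@3.5@.4@6.0
Rye bread@9.0@.6@52.7
.TE

Output:

+----------------------------------------+
|          Composition of Foods          |
+------------+---------------------------+
|            |     Percent by Weight     |
|    Food    +---------+-------+---------+
|            | Protein |  Fat  | Carbo-  |
|            |         |       | hydrate |
+------------+---------+-------+---------+
| Apples     |    .4   |   .5  |  13.0   |
| Halibut    |  18.4   |  5.2  |   ...   |
| Lima beans |   7.5   |   .8  |  22.0   |
| Milk       |   3.3   |  4.0  |   5.0   |
| Mushrooms  |   3.5   |   .4  |   6.0   |
| Rye bread  |   9.0   |   .6  |  52.7   |
+------------+---------+-------+---------+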
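Input:

.TS
allbox;
cfI s s
c cw(1i) cw(1i)
lp9 lp9 lp9.
New York Area Rocks
Era@Formation@Age (years)
Precambrian@Reading Prong@>1 billion
Paleozoic@Manhattan Prong@400 million
Mesozoic@T{
.na
Newark Basin, incl.
Stockton, Lockatong, and Brunswick
formations; also Watchungs
and Palisades.
T}@200 million
Cenozoic@Coastal Plain@T{
On Long Island 30,000 years;
Cretaceous sediments redeposited
by recent glaciation.
.ad
T}
.TE

Output:

+--------------------------------------------------+
|               New York Area Rocks                |
+-------------+-----------------+------------------+
|     Era     |    Formation    |   Age (years)    |
+-------------+-----------------+------------------+
| Precambrian | Reading Prong   | >1 billion       |
+-------------+-----------------+------------------+
| Paleozoic   | Manhattan Prong | 400 million      |
+-------------+-----------------+------------------+
| Mesozoic    | Newark Basin,   | 200 million      |
|             | incl. Stockton, |                  |
|             | Lockatong, and  |                  |
|             | Brunswick for-  |                  |
|             | mations; also   |                  |
|             | Watchungs and   |                  |
|             | Palisades.      |                  |
+-------------+-----------------+------------------+
| Cenozoic    | Coastal Plain   | On Long Island   |
|             |                 | 30,000 years;    |
|             |                 | Cretaceous sedi- |
|             |                 | ments redepo-    |
|             |                 | sited by recent  |
|             |                 | glaciation.      |
+-------------+-----------------+------------------+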
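Input:

.EQ
delim $$
.EN
. . .
.TS
doublebox;
c c
l l.
Name@Definition
.sp
.vs +2p
Gamma@$GAMMA (z) = int sub 0 sup inf t sup {z-1} e sup -t dt$
Sine@$sin (x) = 1 over 2i ( e sup ix - e sup -ix )$
Error@$ roman erf (z) = 2 over sqrt pi int sub 0 sup z e sup {-t sup 2} dt$
Bessel@$ J sub 0 (z) = 1 over pi int sub 0 sup pi cos ( z sin theta ) d theta $
Zeta@$ zeta (s) = sum from k=1 to inf k sup -s ~~( Re~s > 1)$
.vs -2p
.TE

Output (a double-boxed table; the definitions are written here in LaTeX notation):

==============================================================================
||  Name                             Definition                             ||
||                                                                          ||
|| Gamma    \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt                    ||
|| Sine     \sin(x) = \frac{1}{2i}(e^{ix} - e^{-ix})                        ||
|| Error    \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt    ||
|| Bessel   J_0(z) = \frac{1}{\pi} \int_0^\pi \cos(z \sin\theta)\,d\theta   ||
|| Zeta     \zeta(s) = \sum_{k=1}^\infty k^{-s} \quad (\mathrm{Re}\, s > 1) ||
==============================================================================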

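Input:

.TS
box, tab(:);
cb s s s s
cp-2 s s s s
c || c | c | c | c
c || c | c | c | c
r2 || n2 | n2 | n2 | n.
Readability of Text
Line Width and Leading for 10-Point Type
=
Line:Set:1-Point:2-Point:4-Point
Width:Solid:Leading:Leading:Leading
_
9 Pica:\-9.3:\-6.0:\-5.3:\-7.1
14 Pica:\-4.5:\-0.6:\-0.3:\-1.7
19 Pica:\-5.0:\-5.1: 0.0:\-2.0
31 Pica:\-3.7:\-3.8:\-2.4:\-3.6
43 Pica:\-9.1:\-9.0:\-5.9:\-8.8
.TE

Output:

+------------------------------------------------+
|              Readability of Text               |
|    Line Width and Leading for 10-Point Type    |
+=========++=======+=========+=========+=========+
|  Line   ||  Set  | 1-Point | 2-Point | 4-Point |
|  Width  || Solid | Leading | Leading | Leading |
+---------++-------+---------+---------+---------+
|  9 Pica || -9.3  |  -6.0   |  -5.3   |  -7.1   |
| 14 Pica || -4.5  |  -0.6   |  -0.3   |  -1.7   |
| 19 Pica || -5.0  |  -5.1   |   0.0   |  -2.0   |
| 31 Pica || -3.7  |  -3.8   |  -2.4   |  -3.6   |
| 43 Pica || -9.1  |  -9.0   |  -5.9   |  -8.8   |
+---------++-------+---------+---------+---------+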
Input:

.TS
c s
cip-2 s
l n
a n.
Some London Transport Statistics
(Year 1964)
Railway route miles@244
Tube@66
Sub-surface@22
Surface@156
.sp .5
.T&
l r
a r.
Passenger traffic \- railway
Journeys@674 million
Average length@4.55 miles
Passenger miles@3,066 million
.T&
l r
a r.
Passenger traffic \- road
Journeys@2,252 million
Average length@2.26 miles
Passenger miles@5,094 million
.T&
l n
a n.
.sp .5
Vehicles@12,521
Railway motor cars@2,905
Railway trailer cars@1,269
Total railway@4,174
Omnibuses@8,347
.T&
l n
a n.
.sp .5
Staff@73,739
Administrative, etc.@5,582
Civil engineering@5,134
Electrical eng.@1,714
Mech. eng. \- railway@4,310
Mech. eng. \- road@9,152
Railway operations@8,930
Road operations@35,946
Other@2,971
.TE

Output:

     Some London Transport Statistics
                (Year 1964)
Railway route miles                     244
   Tube                                  66
   Sub-surface                           22
   Surface                              156

Passenger traffic - railway
   Journeys                     674 million
   Average length                4.55 miles
   Passenger miles            3,066 million
Passenger traffic - road
   Journeys                   2,252 million
   Average length                2.26 miles
   Passenger miles            5,094 million

Vehicles                             12,521
   Railway motor cars                 2,905
   Railway trailer cars               1,269
   Total railway                      4,174
   Omnibuses                          8,347

Staff                                73,739
   Administrative, etc.               5,582
   Civil engineering                  5,134
   Electrical eng.                    1,714
   Mech. eng. - railway               4,310
   Mech. eng. - road                  9,152
   Railway operations                 8,930
   Road operations                   35,946
   Other                              2,971
Input:

.ps 8
.vs 10p
.TS
center box;
c s s
ci s s
c c c
lB l n.
New Jersey Representatives
(Democrats)
.sp .5
Name@Office address@Phone
.sp .5
James J. Florio@23 S. White Horse Pike, Somerdale 08083@609-627-8222
William J. Hughes@2920 Atlantic Ave., Atlantic City 08401@609-345-4844
James J. Howard@801 Bangs Ave., Asbury Park 07712@201-774-1600
Frank Thompson, Jr.@10 Rutgers Pl., Trenton 08618@609-599-1619
Andrew Maguire@115 W. Passaic St., Rochelle Park 07662@201-843-0240
Robert A. Roe@U.S.P.O., 194 Ward St., Paterson 07510@201-523-5152
Henry Helstoski@666 Paterson Ave., East Rutherford 07073@201-939-9090
Peter W. Rodino, Jr.@Suite 1435A, 970 Broad St., Newark 07102@201-645-3213
Joseph G. Minish@308 Main St., Orange 07050@201-645-6363
Helen S. Meyner@32 Bridge St., Lambertville 08530@609-397-1830
Dominick V. Daniels@895 Bergen Ave., Jersey City 07306@201-659-7700
Edward J. Patten@Natl. Bank Bldg., Perth Amboy 08861@201-826-4610
.sp .5
.T&
ci s s
lB l n.
(Republicans)
.sp .5v
Millicent Fenwick@41 N. Bridge St., Somerville 08876@201-722-8200
Edwin B. Forsythe@301 Mill St., Moorestown 08057@609-235-6622
Matthew J. Rinaldo@1961 Morris Ave., Union 07083@201-687-4235
.TE
.ps 10
.vs 12p

Output:

+------------------------------------------------------------------------------+
|                          New Jersey Representatives                          |
|                                 (Democrats)                                  |
|         Name                       Office address                  Phone     |
| James J. Florio       23 S. White Horse Pike, Somerdale 08083   609-627-8222 |
| William J. Hughes     2920 Atlantic Ave., Atlantic City 08401   609-345-4844 |
| James J. Howard       801 Bangs Ave., Asbury Park 07712         201-774-1600 |
| Frank Thompson, Jr.   10 Rutgers Pl., Trenton 08618             609-599-1619 |
| Andrew Maguire        115 W. Passaic St., Rochelle Park 07662   201-843-0240 |
| Robert A. Roe         U.S.P.O., 194 Ward St., Paterson 07510    201-523-5152 |
| Henry Helstoski       666 Paterson Ave., East Rutherford 07073  201-939-9090 |
| Peter W. Rodino, Jr.  Suite 1435A, 970 Broad St., Newark 07102  201-645-3213 |
| Joseph G. Minish      308 Main St., Orange 07050                201-645-6363 |
| Helen S. Meyner       32 Bridge St., Lambertville 08530         609-397-1830 |
| Dominick V. Daniels   895 Bergen Ave., Jersey City 07306        201-659-7700 |
| Edward J. Patten      Natl. Bank Bldg., Perth Amboy 08861       201-826-4610 |
|                                (Republicans)                                 |
| Millicent Fenwick     41 N. Bridge St., Somerville 08876        201-722-8200 |
| Edwin B. Forsythe     301 Mill St., Moorestown 08057            609-235-6622 |
| Matthew J. Rinaldo    1961 Morris Ave., Union 07083             201-687-4235 |
+------------------------------------------------------------------------------+





This is a paragraph of normal text placed here only to indicate where the left and right margins 
are. In this way the reader can judge the appearance of centered tables or expanded tables, and 
observe how such tables are formatted. 


Input:

.TS
expand;
c s s s
c c c c
l l n n.
Bell Labs Locations
Name@Address@Area Code@Phone
Holmdel@Holmdel, N. J. 07733@201@949-3000
Murray Hill@Murray Hill, N. J. 07974@201@582-6377
Whippany@Whippany, N. J. 07981@201@386-3000
Indian Hill@Naperville, Illinois 60540@312@690-2000
.TE

Output:

                         Bell Labs Locations
   Name                  Address               Area Code      Phone
Holmdel         Holmdel, N. J. 07733              201        949-3000
Murray Hill     Murray Hill, N. J. 07974          201        582-6377
Whippany        Whippany, N. J. 07981             201        386-3000
Indian Hill     Naperville, Illinois 60540        312        690-2000
Input:

.TS
box;
cb s s s
c | c | c s
ltiw(1i) | ltw(2i) | lp8 | lw(1.6i)p8.
Some Interesting Places
_
Name@Description@Practical Information
_
T{
American Museum of Natural History
T}@T{
The collections fill 11.5 acres (Michelin) or 25 acres (MTA)
of exhibition halls on four floors. There is a full-sized replica
of a blue whale and the world's largest star sapphire (stolen in 1964).
T}@Hours@10-5, ex. Sun 11-5, Wed. to 9
\^@\^@Location@T{
Central Park West & 79th St.
T}
\^@\^@Admission@Donation: $1.00 asked
\^@\^@Subway@AA to 81st St.
\^@\^@Telephone@212-873-4225
_
Bronx Zoo@T{
About a mile long and .6 mile wide, this is the largest zoo in America.
A lion eats 18 pounds
of meat a day while a sea lion eats 15 pounds of fish.
T}@Hours@T{
10-4:30 winter, to 5:00 summer
T}
\^@\^@Location@T{
185th St. & Southern Blvd, the Bronx.
T}
\^@\^@Admission@$1.00, but Tu,We,Th free
\^@\^@Subway@2, 5 to East Tremont Ave.
\^@\^@Telephone@212-933-1759
_
Brooklyn Museum@T{
Five floors of galleries contain American and ancient art.
There are American period rooms and architectural ornaments saved
from wreckers, such as a classical figure from Pennsylvania Station.
T}@Hours@Wed-Sat, 10-5, Sun 12-5
\^@\^@Location@T{
Eastern Parkway & Washington Ave., Brooklyn.
T}
\^@\^@Admission@Free
\^@\^@Subway@2,3 to Eastern Parkway.
\^@\^@Telephone@212-638-5000
_
T{
New-York Historical Society
T}@T{
All the original paintings for Audubon's
.I
Birds of America
.R
are here, as are exhibits of American decorative arts, New York history,
Hudson River school paintings, carriages, and glass paperweights.
T}@Hours@T{
Tues-Fri & Sun, 1-5; Sat 10-5
T}
\^@\^@Location@T{
Central Park West & 77th St.
T}
\^@\^@Admission@Free
\^@\^@Subway@AA to 81st St.
\^@\^@Telephone@212-873-3400
.TE

Output:

+---------------------------------------------------------------------------------------------------+
|                                      Some Interesting Places                                      |
+------------------+-----------------------------------+--------------------------------------------+
|       Name       |            Description            |           Practical Information            |
+------------------+-----------------------------------+-----------+--------------------------------+
| American Muse-   | The collections fill 11.5 acres   | Hours     | 10-5, ex. Sun 11-5, Wed. to 9  |
| um of Natural    | (Michelin) or 25 acres (MTA)      | Location  | Central Park West & 79th St.   |
| History          | of exhibition halls on four       | Admission | Donation: $1.00 asked          |
|                  | floors. There is a full-sized re- | Subway    | AA to 81st St.                 |
|                  | plica of a blue whale and the     | Telephone | 212-873-4225                   |
|                  | world's largest star sapphire     |           |                                |
|                  | (stolen in 1964).                 |           |                                |
+------------------+-----------------------------------+-----------+--------------------------------+
| Bronx Zoo        | About a mile long and .6 mile     | Hours     | 10-4:30 winter, to 5:00 summer |
|                  | wide, this is the largest zoo in  | Location  | 185th St. & Southern Blvd, the |
|                  | America. A lion eats 18           |           | Bronx.                         |
|                  | pounds of meat a day while a      | Admission | $1.00, but Tu,We,Th free       |
|                  | sea lion eats 15 pounds of fish.  | Subway    | 2, 5 to East Tremont Ave.      |
|                  |                                   | Telephone | 212-933-1759                   |
+------------------+-----------------------------------+-----------+--------------------------------+
| Brooklyn Museum  | Five floors of galleries contain  | Hours     | Wed-Sat, 10-5, Sun 12-5        |
|                  | American and ancient art.         | Location  | Eastern Parkway & Washington   |
|                  | There are American period         |           | Ave., Brooklyn.                |
|                  | rooms and architectural orna-     | Admission | Free                           |
|                  | ments saved from wreckers,        | Subway    | 2,3 to Eastern Parkway.        |
|                  | such as a classical figure from   | Telephone | 212-638-5000                   |
|                  | Pennsylvania Station.             |           |                                |
+------------------+-----------------------------------+-----------+--------------------------------+
| New-York Histor- | All the original paintings for    | Hours     | Tues-Fri & Sun, 1-5; Sat 10-5  |
| ical Society     | Audubon's Birds of America are    | Location  | Central Park West & 77th St.   |
|                  | here, as are exhibits of Ameri-   | Admission | Free                           |
|                  | can decorative arts, New York     | Subway    | AA to 81st St.                 |
|                  | history, Hudson River school      | Telephone | 212-873-3400                   |
|                  | paintings, carriages, and glass   |           |                                |
|                  | paperweights.                     |           |                                |
+------------------+-----------------------------------+-----------+--------------------------------+
List of Tbl Command Characters and Words


Command      Meaning                            Section
a A          Alphabetic subcolumn               2
allbox       Draw box around all items          1
b B          Boldface item                      2
box          Draw box around table              1
c C          Centered column                    2
center       Center table in page               1
doublebox    Doubled box around table           1
e E          Equal width columns                2
expand       Make table full line width         1
f F          Font change                        2
i I          Italic item                        2
l L          Left adjusted column               2
n N          Numerical column                   2
nnn          Column separation                  2
p P          Point size change                  2
r R          Right adjusted column              2
s S          Spanned item                       2
t T          Vertical spanning at top           2
tab (x)      Change data separator character    1
T{ T}        Text block                         3
v V          Vertical spacing change            2
w W          Minimum width value                2
.xx          Included troff command             3
|            Vertical line                      2
||           Double vertical line               2
^            Vertical span                      2
\^           Vertical span                      3
=            Double horizontal line             2,3
_            Horizontal line                    2,3
\_           Short horizontal line              3
\Rx          Repeat character                   3
Refer — A Bibliography System
Bill Tuthill 


Computing Services 
University of California 
Berkeley, CA 94720 


Introduction 


Taken together, the refer programs constitute a database system for use with variable-length 
information. To distinguish various types of bibliographic material, the system uses labels composed 
of upper case letters, preceded by a percent sign and followed by a space. For example, one document 
might be given this entry: 


%A Joel Kies
%T Document Formatting on Unix Using the -ms Macros
%I Computing Services
%C Berkeley
%D 1980


Each line is called a field, and lines grouped together are called a record; records are separated from 
each other by a blank line. Bibliographic information follows the labels, containing data to be used by 
the refer system. The order of fields is not important, except that authors should be entered in the 
same order as they are listed on the document. Fields can be as long as necessary, and may even be 
continued on the following line(s). 
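

The record layout just described is simple enough to read mechanically. The following Python sketch is illustrative only and is not part of the refer distribution: the function name parse_records is hypothetical, and joining a continuation line to the previous field with a single space is an assumption.

```python
def parse_records(text):
    """Split a refer-style database into records, each a list of (label, value)."""
    records, current = [], []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            # A blank line separates one record from the next.
            if current:
                records.append(current)
                current = []
        elif line.startswith("%") and len(line) >= 2 and line[1].isupper():
            # A label is a percent sign, an upper case letter, then a space.
            current.append((line[:2], line[3:]))
        elif current:
            # Anything else continues the previous field (assumed: joined by a space).
            label, value = current[-1]
            current[-1] = (label, value + " " + line.strip())
    if current:
        records.append(current)
    return records


sample = """%A Joel Kies
%T Document Formatting on Unix
Using the -ms Macros
%D 1980

%A Mike E. Lesk
%D 1978
"""
records = parse_records(sample)
```

Fields are kept in a list rather than a dictionary so that several %A lines keep the order in which the authors were entered.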


The labels are meaningful to nroff/troff macros, and, with a few exceptions, the refer program 
itself does not pay attention to them. This implies that you can change the label codes, if you also 
change the macros used by nroff/troff. The macro package takes care of details like proper
ordering, underlining the book title or journal name, and quoting the article’s title. Here are the labels
used by refer, with an indication of what they represent: 
%H Header commentary, printed before reference
%A Author’s name
%Q Corporate or foreign author (unreversed)
%T Title of article or book
%S Series title
%J Journal containing article
%B Book containing article
%R Report, paper, or thesis (for unpublished material)
%V Volume
%N Number within volume
%E Editor of book containing article
%P Page number(s)
%I Issuer (publisher)
%C City where published
%D Date of publication
%O Other commentary, printed at end of reference
%K Keywords used to locate reference
%L Label used by -k option of refer
%X Abstract (used by roffbib, not by refer)


Only relevant fields should be supplied. Except for %A, each field should be given only once; in the 
case of multiple authors, the senior author should come first. The %Q is for organizational authors, 
or authors with Japanese or Arabic names, in which cases the order of names should be preserved. 
Books should be labeled with the %T, not with the %B, which is reserved for books containing arti- 
cles. The %J and %B fields should never appear together, although if they do, the %J will override 
the %B. If there is no author, just an editor, it is best to type the editor in the %A field, as in this 
example: 


%A Bertrand Bronson, ed.


The %E field is used for the editor of a book (%B) containing an article, which has its own author. 
For unpublished material such as theses, use the %R field; the title in the %T field will be quoted, 
but the contents of the %R field will not be underlined. Unlike other fields, %H, %O, and %X 
should contain their own punctuation. Here is a modest example: 


%A Mike E. Lesk
%T Some Applications of Inverted Indexes on the Unix System
%B Unix Programmer’s Manual
%I Bell Laboratories
%C Murray Hill, NJ
%D 1978
%V 2a
%K refer mkey inv hunt
%X Difficult to read paper that dwells on indexing strategies,
giving little practical advice about using \fBrefer\fP.


Note that the author’s name is given in normal order, without inverting the surname; inversion is 
done automatically, except when %Q is used instead of %A. We use %X rather than %O for the 
commentary because we do not want the comment printed all the time. The %O and %H fields are 
printed by both refer and roffbib; the %X field is printed only by roffbib, as a detached annota- 
tion paragraph. 


Data Entry with Addbib 


The addbib program is for creating and extending bibliographic databases. You must give it 
the filename of your bibliography: 
% addbib database


Every time you enter addbib, it asks if you want instructions. To get them, type y; to skip them, 
type RETURN. Addbib prompts for various fields, reads from the keyboard, and writes records con- 
taining the refer codes to the database. After finishing a field entry, you should end it by typing 
RETURN. If a field is too long to fit on a line, type a backslash (\) at the end of the line, and you will 
be able to continue on the following line. Note: the backslash works in this capacity only inside add- 
bib. 


A field will not be written to the database if nothing is entered into it. Typing a minus sign as 
the first character of any field will cause addbib to back up one field at a time. Backing up is the 
best way to add multiple authors, and it really helps if you forget to add something important. Fields 
not contained in the prompting skeleton may be entered by typing a backslash as the last character 
before RETURN. The following line will be sent verbatim to the database and addbib will resume 
with the next field. This is identical to the procedure for dealing with long fields, but with new fields, 
don’t forget the % key-letter. 


Finally, you will be asked for an abstract (or annotation), which will be preserved as the %X 
field. Type in as many lines as you need, and end with a control-D (hold down the CTRL button, then 
press the “d” key). This prompting for an abstract can be suppressed with the -a command line
option. 


After one bibliographic record has been completed, addbib will ask if you want to continue. If 
you do, type RETURN; to quit, type q or n (quit or no). It is also possible to use one of the system 
editors to correct mistakes made while entering data. After the “Continue?” prompt, type any of the 
following: edit, ex, vi, or ed — you will be placed inside the corresponding editor, and returned to 
addbib afterwards, from where you can either quit or add more data. 


If the prompts normally supplied by addbib are not enough, are in the wrong order, or are too 
numerous, you can redefine the skeleton by constructing a promptfile. Create some file, to be named 
after the -p command line option. Place the prompts you want on the left side, followed by a single
TAB (control-I), then the refer code that is to appear in the bibliographic database. Addbib will 
send the left side to the screen, and the right side, along with data entered, to the database. 


Printing the Bibliography 


Sortbib is for sorting the bibliography by author (%A) and date (%D), or by data in other 
fields. It is quite useful for producing bibliographies and annotated bibliographies, which are seldom 
entered in strict alphabetical order. It takes as arguments the names of up to 16 bibliography files, 
and sends the sorted records to standard output (the terminal screen), which may be redirected 
through a pipe or into a file. 


The -sKEYS flag to sortbib will sort by fields whose key-letters are in the KEYS string, 
rather than merely by author and date. Key-letters in KEYS may be followed by a ‘+’ to indicate 
that all such fields are to be used. The default is to sort by senior author and date (printing the 
senior author last name first), but -sA+D will sort by all authors and then date, and -sATD will sort
on senior author, then title, and then date. 
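

How a KEYS string translates into a sort order can be sketched as follows. This is a hedged illustration, not sortbib's code: the helper names surname_first and sort_key are hypothetical, the comparison rules (case-insensitive, last word taken as the surname) are assumptions, and the sample records are invented.

```python
def surname_first(name):
    """'Brian Kernighan' -> 'Kernighan Brian' (last word moved to the front)."""
    words = name.split()
    return " ".join(words[-1:] + words[:-1])


def sort_key(record, keys="AD"):
    """Build a comparison key from a record, a list of (label, value) pairs."""
    parts = []
    i = 0
    while i < len(keys):
        letter = keys[i]
        use_all = keys[i + 1:i + 2] == "+"      # 'A+' means every %A field
        values = [v for label, v in record if label == "%" + letter]
        if not use_all:
            values = values[:1]                 # otherwise only the first one
        if letter == "A":
            values = [surname_first(v) for v in values]
        parts.append(tuple(v.lower() for v in values))
        i += 2 if use_all else 1
    return tuple(parts)


# Invented sample records, for illustration only.
records = [
    [("%A", "Brian Kernighan"), ("%T", "A Troff Tutorial"), ("%D", "1978")],
    [("%A", "Alfred Aho"), ("%A", "Brian Kernighan"), ("%T", "Awk"), ("%D", "1979")],
    [("%A", "Mike E. Lesk"), ("%T", "Tbl"), ("%D", "1976")],
]
by_author_date = sorted(records, key=sort_key)                # like the default
by_title = sorted(records, key=lambda r: sort_key(r, "TD"))   # like -sTD
```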


Roffbib is for running off the (probably sorted) bibliography. It can handle annotated 
bibliographies; annotations are entered in the %X (abstract) field. Roffbib is a shell script that
calls refer -B and nroff -mbib. It uses the macro definitions that reside in
/usr/lib/tmac/tmac.bib, which you can redefine if you know nroff and troff. Note that refer will
print the %H and %O commentaries, but will ignore abstracts in the %X field; roffbib will print
both fields, unless annotations are suppressed with the -x option.


The following command sequence will lineprint the entire bibliography, organized alphabetically 
by author and date: 
% sortbib database | roffbib | lpr


This is a good way to proofread the bibliography, or to produce a stand-alone bibliography at the end 
of a paper. Incidentally, roffbib accepts all flags used with nroff. For example: 


% sortbib database | roffbib -Tdtc -s1


will make accent marks work on a DTC daisy-wheel printer, and stop at the bottom of every page for 
changing paper. The -n and -o flags may also be quite useful, to start page numbering at a selected
point, or to produce only specific pages. 


Roffbib understands four command-line number registers, which are something like the
two-letter number registers in -ms. The -rN1 argument will number references beginning at one (1); use
another number to start somewhere besides one. The -rV2 flag will double-space the entire
bibliography, while -rV1 will double-space the references, but single-space the annotation paragraphs. Finally,
specifying -rL6i changes the line length from 6.5 inches to 6 inches, and saying -rO1i sets the page
offset to one inch, instead of zero. (That’s a capital O after -r, not a zero.)


Citing Papers with Refer 


The refer program normally copies input to output, except when it encounters an item of the 
form: 


.[
partial citation
.]


The partial citation may be just an author’s name and a date, or perhaps a title and a keyword, or 
maybe just a document number. Refer looks up the citation in the bibliographic database, and 
transforms it into a full, properly formatted reference. If the partial citation does not correctly iden- 
tify a single work (either finding nothing, or more than one reference), a diagnostic message is given. 
If nothing is found, it will say “No such paper.” If more than one reference is found, it will say “Too 
many hits.” Other diagnostic messages can be quite cryptic; if you are in doubt, use checknr to ver- 
ify that all your .[’s have matching .]’s. 


When everything goes well, the reference will be brought in from the database, numbered, and 
placed at the bottom of the page. This citation,[1] for example, was produced by:


This citation,
.[
lesk inverted indexes
.]
for example, was produced by


The .[ and .] markers, in essence, replace the .FS and .FE of the -ms macros, and also provide a
numbering mechanism. Footnote numbers will be bracketed on the lineprinter, but superscripted
on daisy-wheel terminals and in troff. In the reference itself, articles will be quoted, and books and 
journals will be underlined in nroff, and italicized in troff. 


Sometimes you need to cite a specific page number along with more general bibliographic 
material. You may have, for instance, a single document that you refer to several times, each time 
giving a different page citation. This is how you could get “p. 10” in the reference: 


.[
kies document formatting
%P 10
.]


The first line, a partial citation, will find the reference in your bibliography. The second line will 
insert the page number into the final citation. Ranges of pages may be specified as “%P 56-78”.


When the time comes to run off a paper, you will need to have two files: the bibliographic data- 
base, and the paper to format. Use a command line something like one of these: 


% refer -p database paper | nroff -ms
% refer -p database paper | tbl | nroff -ms
% refer -p database paper | tbl | neqn | nroff -ms


If other preprocessors are used, refer should precede tbl, which must in turn precede eqn or neqn. 
The -p option specifies a “private” database, which most bibliographies are.


Refer’s Command-line Options 


Many people like to place references at the end of a chapter, rather than at the bottom of the 
page. The -e option will accumulate references until a macro sequence of the form


.[
$LIST$
.]


is encountered (or until the end of file). Refer will then write out all references collected up to that 
point, collapsing identical references. Warning: there is a limit (currently 200) on the number of 
references that can be accumulated at one time. 


It is also possible to sort references that appear at the end of text. The -sKEYS flag will sort
references by fields whose key-letters are in the KEYS string, and permute reference numbers in the
text accordingly. It is unnecessary to use -e with it, since -s implies -e. Key-letters in KEYS may
be followed by a ‘+’ to indicate that all such fields are to be used. The default is to sort by senior
author and date, but -sA+D will sort on all authors and then date, and -sA+T will sort by authors
and then title. 


Refer can also make citations in what is known as the Social or Natural Sciences format. 
Instead of numbering references, the -l (letter ell) flag makes labels from the senior author’s last
name and the year of publication. For example, a reference to the paper on Inverted Indexes cited
above might appear as [Lesk1978a]. It is possible to control the number of characters in the last
name, and the number of digits in the date. For instance, the command line argument -l6,2 might
produce a reference such as [Kernig78c]. 
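

The way such a label is put together can be sketched in a few lines. This is an illustration under stated assumptions, not refer's implementation: the function make_label and its parameter names are hypothetical, the surname is taken to be the last word of the author field, and the trailing letter (the “a” in [Lesk1978a]) is left out because the rule for choosing it is not given here.

```python
def make_label(author, date, name_chars=None, date_digits=None):
    """Join the author's last name and the year, optionally truncating each."""
    surname = author.split()[-1]
    year = "".join(ch for ch in date if ch.isdigit())
    if name_chars:
        surname = surname[:name_chars]      # keep the first m characters
    if date_digits:
        year = year[-date_digits:]          # keep the last n digits
    return surname + year


full_label = make_label("Mike E. Lesk", "1978")
short_label = make_label("Brian W. Kernighan", "1978", 6, 2)
```

Here full_label is "Lesk1978" and short_label is "Kernig78", the stems of the two labels shown above.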


Some bibliography standards shun both footnote numbers and labels composed of author and 
date, requiring some keyword to identify the reference. The -k flag indicates that, instead of
numbering references, key labels specified on the %L line should be used to mark references.


The -n flag means to not search the default reference file, located in /usr/dict/papers/Rv7man.
Using this flag may make refer marginally faster. The -an flag will reverse the first n author names,
printing Jones, J. A. instead of J. A. Jones. Often -a1 is enough; this will reverse the names of only
the senior author. In some versions of refer there is also the -f flag to set the footnote number to
some predetermined value; for example, -f23 would start numbering with footnote 23.


Making an Index 


Once your database is large and relatively stable, it is a good idea to make an index to it, so that 
references can be found quickly and efficiently. The indxbib program makes an inverted index to 
the bibliographic database (this program is called pubindex in the Bell Labs manual). An inverted 
index could be compared to the thumb cuts of a dictionary — instead of going all the way through 
your bibliography, programs can move to the exact location where a citation is found. 


Indxbib itself takes a while to run, and you will need sufficient disk space to store the indexes. 
But once it has been run, access time will improve dramatically. Furthermore, large databases of 
several million characters can be indexed with no problem. The program is exceedingly simple to use: 
% indxbib database


Be aware that changing your database will require that you run indxbib over again. If you don’t, you 
may fail to find a reference that really is in the database. 


Once you have built an inverted index, you can use lookbib to find references in the database. 
Lookbib cannot be used until you have run indxbib. When editing a paper, lookbib is very useful 
to make sure that a citation can be found as specified. It takes one argument, the name of the 
bibliography, and then reads partial citations from the terminal, returning references that match, or 
nothing if none match. Its prompt is the greater-than sign. 


% lookbib database
> lesk inverted indexes
%A Mike E. Lesk
%T Some Applications of Inverted Indexes on the Unix System
%J Unix Programmer’s Manual
%I Bell Laboratories
%C Murray Hill, NJ
%D 1978
%V 2a
%X Difficult to read paper that dwells on indexing strategies,
giving little practical advice about using \fBrefer\fP.
>


If more than one reference comes back, you will have to give a more precise citation for refer. 
Experiment until you find something that works; remember that it is harmless to overspecify. To get 
out of the lookbib program, type a control-D alone on a line; lookbib then exits with an “EOT” 
message. 


Lookbib can also be used to extract groups of related citations. For example, to find all the 
papers by Brian Kernighan found in the system database, and send the output to a file, type: 


% lookbib /usr/dict/papers/Ind > kern.refs
> kernighan
> EOT
% cat kern.refs


Your file, “kern.refs”, will be full of references. A similar procedure can be used to pull out all papers 
of some date, all papers from a given journal, all papers containing a certain group of keywords, etc. 


Refer Bugs and Some Solutions 


The refer program will mess up if there are blanks at the end of lines, especially the %A 
author line. Addbib carefully removes trailing blanks, but they may creep in again during editing. 
Use an editor command — g/ *$/s/// — to remove trailing blanks from your bibliography. 
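

The same cleanup can be done outside the editor. The short Python sketch below is an illustration, not a tool supplied with refer; the function name strip_trailing_blanks is hypothetical. Like the editor command, it removes trailing spaces only, not tabs.

```python
import re


def strip_trailing_blanks(text):
    """Remove spaces at the end of every line, like the editor command g/ *$/s///."""
    return re.sub(r" +$", "", text, flags=re.MULTILINE)


cleaned = strip_trailing_blanks("%A Mike E. Lesk  \n%D 1978\n")
```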


Having bibliographic fields passed through as string definitions implies that interpolated strings 
(such as accent marks) must have two backslashes, so they can pass through copy mode intact. For 
instance, the word “téléphone” would have to be represented: 


te\\*'le\\*'phone


in order to come out correctly. In the %X field, by contrast, you will have to use single backslashes 
instead. This is because the %X field is not passed through as a string, but as the body of a
paragraph macro.


Another problem arises from authors with foreign names. When a name like “Valéry Giscard 
d’Estaing” is turned around by the -a option of refer, it will appear as “d’Estaing, Valéry Giscard,”
rather than as “Giscard d’Estaing, Valéry.” To prevent this, enter names as follows: 
%A Vale\\*'ry Giscard\0d'Estaing
%A Alexander Csoma\0de\0Ko\\*:ro\\*:s


(The second is the name of a famous Hungarian linguist.) The backslash-zero is an nroff/troff 
request meaning to insert a digit-width space. It will protect against faulty name reversal, and also 
against mis-sorting. 
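

Why the backslash-zero helps can be seen in a small sketch of name reversal. It assumes, in line with the behavior described above, that the last blank-separated word is treated as the surname; the function reverse_name is hypothetical and is not refer's own code.

```python
def reverse_name(name):
    """Turn 'J. A. Jones' into 'Jones, J. A.'; only blanks separate words."""
    words = name.split(" ")
    if len(words) < 2:
        return name
    return words[-1] + ", " + " ".join(words[:-1])


plain = reverse_name("Valery Giscard d'Estaing")
protected = reverse_name("Valery Giscard\\0d'Estaing")
```

Because \0 is not a blank, the protected form keeps the two parts of the surname together, and nroff/troff later prints the \0 as a space.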


Footnote numbers are placed at the end of the line before the .[ macro. This line should be a 
line of text, not a macro. As an example, if the line before the .[ is a .R macro, then the .R will eat 
the footnote number. (The .R is an —ms request meaning change to Roman font.) In cases where the 
font needs changing, it is necessary to do the following: 


\fIet al.\fR
.[
awk aho kernighan weinberger
.]


Now the reference will be to Aho et al.[2] The \fI changes to italics, and the \fR changes back to Roman
font. Both these requests are nroff/troff requests, not part of -ms. If and when a footnote number
is added after this sequence, it will indeed appear in the output. 


Internal Details of Refer 


You have already read everything you need to know in order to use the refer bibliography
system. The remaining sections are provided only for extra information, and in case you need to change
the way refer works.


The output of refer is a stream of string definitions, one for each field in a reference. To create 
string names, percent signs are simply changed to an open bracket, and an [F string is added, contain- 
ing the footnote number. The %X, %Y and %Z fields are ignored; however, the annobib program 
changes the %X to an .AP (annotation paragraph) macro. The citation used above yields this inter- 
mediate output: 


.ds [F 1
.]-
.ds [A Mike E. Lesk
.ds [T Some Applications of Inverted Indexes on the Unix System
.ds [J Unix Programmer’s Manual
.ds [I Bell Laboratories
.ds [C Murray Hill, NJ
.ds [D 1978
.ds [V 2a
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article


These string definitions are sent to nroff, which can use the -ms macros defined in
/usr/lib/mx/tmac.xref to take care of formatting things properly. The initializing macro .]- precedes
the string definitions, and the labeled macro .][ follows. These are changed from the input .[ and .] so
that running a file twice through refer is harmless. 


The .][ macro, used to print the reference, is given a type-number argument, which is a numeric 
label indicating the type of reference involved. Here is a list of the various kinds of references: 
Field    Value   Kind of Reference
%J       1       Journal Article
%B       3       Article in Book
%R %G    4       Report, Government Report
%I       2       Book
%M       5       Bell Labs Memorandum (undefined)
none     0       Other


The order listed above is indicative of the precedence of the various fields. In other words, a refer- 
ence that has both the %J and %B fields will be classified as a journal article. If none of the fields 
listed is present, then the reference will be classified as “other.” 
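

The precedence rule amounts to a first-match lookup, sketched below in Python. The names PRECEDENCE and reference_type are hypothetical; the values are the type numbers from the table above, and the code is only an illustration of the rule, not refer's source.

```python
# Field labels in order of precedence, each with its type number.
PRECEDENCE = [("%J", 1), ("%B", 3), ("%R", 4), ("%G", 4), ("%I", 2), ("%M", 5)]


def reference_type(labels):
    """Return the type number for a reference, given the set of labels it has."""
    for label, value in PRECEDENCE:
        if label in labels:
            return value        # the first field found in precedence order wins
    return 0                    # none of the listed fields: "other"


journal = reference_type({"%A", "%T", "%J", "%B"})
book = reference_type({"%A", "%T", "%I", "%D"})
```

A reference with both %J and %B is typed 1 (journal article), and a reference whose only listed field is %I is typed 2 (book).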


The footnote number is flagged in the text with the following sequence, where number is the 
footnote number: 


\*([.number\*(.] 


The \*([. and \*(.] stand for bracketing or superscripting. In nroff with low-resolution devices such 
as the lpr and a crt, footnote numbers will be bracketed. In troff, or on daisy-wheel printers, foot- 
note numbers will be superscripted. Punctuation normally comes before the reference number; this 
can be changed by using the -P (postpunctuation) option of refer.


In some cases, it is necessary to override certain fields in a reference. For instance, each time a 
work is cited, you may want to specify different page numbers, and you may want to change certain 
fields. This citation will find the Lesk reference, but will add specific page numbers to the output, 
even though no page numbers appeared in the original reference. 


.[
lesk inverted indexes
%P 7-13
%I Computing Services
%O UNX 12.2.2.
.]


The %I line will also override any previous publisher information, and the %O line will append some 
commentary. The refer program simply adds the new %P, %I, and %O strings to the output, and 
later string definitions cancel earlier ones.


It is also possible to insert an entire citation that does not appear in the bibliographic database. 
This reference, for example, could be added as follows: 


.[
%A Brian Kernighan
%T A Troff Tutorial
%I Bell Laboratories
%D 1978
.]


This will cause refer to interpret the fields exactly as given, without searching the bibliographic
database. This practice is not recommended, however, because it’s better to add new references to the
database, so they can be used again later.


If you want to change the way footnote numbers are printed, signals can be given on the .[ and .] 
lines. For example, to say “See reference (2),” the citation should appear as: 
See reference
.[ (
partial citation
.] ),


Note that blanks are significant on these signal lines. If a permanent change in the footnote format is 
desired, it’s best to redefine the [. and .] strings. 


Changing the Refer Macros 


This section is provided for those who wish to rewrite or modify the refer macros. This is 
necessary in order to make output correspond to specific journal requirements, or departmental stan- 
dards. First there is an explanation of how new macros can be substituted for the old ones. Then 
several alterations are given as examples. Finally, there is an annotated copy of the refer macros 
used by roffbib.


The refer macros for nroff/troff supplied by the -ms macro package reside in 
/usr/lib/mx/tmac.xref; they are reference macros, for producing footnotes or endnotes. The refer 
macros used by roffbib, on the other hand, reside in /usr/lib/tmac/tmac.bib; they are for producing a 
stand-alone bibliography. 


To change the macros used by roffbib, you will need to get your own version of this shell script 
into the directory where you are working. These two commands will get you a copy of roffbib and 
the macros it uses:


% cp /usr/lib/tmac/tmac.bib bibmac


You can proceed to change bibmac as much as you like. Then when you use roffbib, you should 
specify your own version of the macros, which will be substituted for the normal ones 


% roffbib -m bibmac filename


where filename is the name of your bibliography file. Make sure there’s a space between -m and
bibmac. 


If you want to modify the refer macros for use with nroff and the -ms macros, you will need
to get a copy of “tmac.xref”:


% cp /usr/lib/ms/s.ref refmac


These macros are much like “bibmac”, except they have .FS and .FE requests, to be used in
conjunction with the -ms macros, rather than independently defined .XP and .AP requests. Now you can
put this line at the top of the paper to be formatted:


.so refmac


Your new refer macros will override the definitions previously read in by the -ms package. This
method works only if “refmac” is in the working directory. 


Suppose you didn’t like the way dates are printed, and wanted them to be parenthesized, with 
no comma before. There are five identical lines you will have to change. The first line below is the 
old way, while the second is the new way: 


.if !"\\*([D"" , \\*([D\c
.if !"\\*([D"" \& (\\*([D)\c


In the first line, there is a comma and a space, but no parentheses. The “\c” at the end of each line 
indicates to nroff that it should continue, leaving no extra space in the output. The “\&” in the 
second line is the do-nothing character; when followed by a space, a space is sent to the output. 


If you need to format a reference in the style favored by the Modern Language Association or 
Chicago University Press, in the form (city: publisher, date), then you will have to change the middle 
of the book macro [2 as follows: 
\& (\c
.if !"\\*([C"" \\*([C:
\\*([I\c
.if !"\\*([D"" , \\*([D\c
)\c


This would print (Berkeley: Computing Services, 1982) if all three strings were present. The first line 
prints a space and a parenthesis; the second prints the city (and a colon) if present; the third always 
prints the publisher (books must have a publisher, or else they’re classified as other); the fourth line 
prints a comma and the date if present; and the fifth line closes the parentheses. You would need to 
make similar changes to the other macros as well. 
Some Applications of Inverted Indexes on the UNIX System
M. E. Lesk 


Bell Laboratories 
Murray Hill, New Jersey 07974 


1. Introduction. 


The UNIX system has many utilities (e.g. grep, awk, lex, egrep, fgrep, ...) to search 
through files of text, but most of them are based on a linear scan through the entire file, using 
some deterministic automaton. This memorandum discusses a program which uses inverted 
indexes[1] and can thus be used on much larger data bases.


As with any indexing system, of course, there are some disadvantages; once an index is 
made, the files that have been indexed can not be changed without remaking the index. Thus 
applications are restricted to those making many searches of relatively stable data. Further- 
more, these programs depend on hashing, and can only search for exact matches of whole key- 
words. It is not possible to look for arithmetic or logical expressions (e.g. “date greater than 
1970”) or for regular expression searching such as that in lex.[2]


Currently there are two uses of this software, the refer preprocessor to format refer- 
ences, and the lookall command to search through all text files on the UNIX system. 


The remaining sections of this memorandum discuss the searching programs and their 
uses. Section 2 explains the operation of the searching algorithm and describes the data
collected for use with the lookall command. The more important application, refer, has a user’s
description in section 3. Section 4 goes into more detail on reference files for the benefit of 
those who wish to add references to data bases or write new troff macros for use with refer. 
The options to make refer collect identical citations, or otherwise relocate and adjust refer- 
ences, are described in section 5. The UNIX manual sections for refer, lookall, and associated 
commands are attached as appendices. 


2. Searching. 


The indexing and searching process is divided into two phases, each made of two parts. 
These are shown below. 


A. Construct the index. 


(1) Find keys — turn the input files into a sequence of tags and keys, where each tag 
identifies a distinct item in the input and the keys for each such item are the 
strings under which it is to be indexed. 


(2) Hash and sort — prepare a set of inverted indexes from which, given a set of keys, 
the appropriate item tags can be found quickly. 


B. Retrieve an item in response to a query. 


+ UNIX is a trademark of Bell Laboratories. 

1. D. Knuth, The Art of Computer Programming: Vol. 3, Sorting and Searching, Addison-Wesley,
Reading, Mass., 1977. See section 6.5.

2. M. E. Lesk, “Lex — A Lexical Analyzer Generator,” Comp. Sci. Tech. Rep. No. 39, Bell Laboratories,
Murray Hill, New Jersey, October 1975.
(3) Search — Given some keys, look through the files prepared by the hashing and
sorting facility and derive the appropriate tags. 


(4) Deliver — Given the tags, find the original items. This completes the searching 
process. 


The first phase, making the index, is presumably done relatively infrequently. It should, of 
course, be done whenever the data being indexed change. In contrast, the second phase, 
retrieving items, is presumably done often, and must be rapid. 


An effort is made to separate code which depends on the data being handled from code 
which depends on the searching procedure. The search algorithm is involved only in programs 
(2) and (3), while knowledge of the actual data files is needed only by programs (1) and (4). 
Thus it is easy to adapt to different data files or different search algorithms. 


To start with, it is necessary to have some way of selecting or generating keys from input 
files. For dealing with files that are basically English, we have a key-making program which 
automatically selects words and passes them to the hashing and sorting program (step 2). The 
format used has one line for each input item, arranged as follows:


name:start,length (tab) key1 key2 key3 ...


where name is the file name, start is the starting byte number, and length is the number of 
bytes in the entry. 


These lines are the only input used to make the index. The first field (the file name, 
byte position, and byte count) is the tag of the item and can be used to retrieve it quickly. 
Normally, an item is either a whole file or a section of a file delimited by blank lines. After 
the tab, the second field contains the keys. The keys, if selected by the automatic program, 
are any alphanumeric strings which are not among the 100 most frequent words in English 
and which are not entirely numeric (except for four-digit numbers beginning 19, which are 
accepted as dates). Keys are truncated to six characters and converted to lower case. Some 
selection is needed if the original items are very large. We normally just take the first n keys, 
with n less than 100 or so; this replaces any attempt at intelligent selection. One file in our 
system is a complete English dictionary; it would presumably be retrieved for all queries. 
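

The key selection rules above can be sketched as follows. This is a hedged illustration rather than the key-making program itself: make_keys is a hypothetical name, and COMMON_WORDS is a short stand-in for the list of the 100 most frequent English words, which is not reproduced in this memorandum.

```python
import re

# Stand-in for the 100 most frequent English words (an assumption).
COMMON_WORDS = {"the", "of", "and", "a", "to", "in", "is", "on", "for", "some"}


def make_keys(text, limit=100):
    """Select index keys from English text, following the rules described above."""
    keys = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        word = word.lower()
        if word in COMMON_WORDS:
            continue                        # too frequent to be useful
        if word.isdigit() and not (len(word) == 4 and word.startswith("19")):
            continue                        # numeric, and not a 19xx date
        keys.append(word[:6])               # truncate to six characters
        if len(keys) == limit:
            break                           # just take the first n keys
    return keys


keys = make_keys("Some Applications of Inverted Indexes on the Unix System, 1978, vol 2")
```

For the title used as input, the keys are applic, invert, indexe, unix, system, 1978, and vol; the trailing "2" is dropped because it is numeric but not a date.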


To generate an inverted index to the list of record tags and keys, the keys are hashed 
and sorted to produce an index. What is wanted, ideally, is a series of lists showing the tags 
associated with each key. To condense this, what is actually produced is a list showing the 
tags associated with each hash code, and thus with some set of keys. To speed up access and 
further save space, a set of three or possibly four files is produced. These files are: 


File       Contents
entry      Pointers to posting file for each hash code
posting    Lists of tag pointers for each hash code
tag        Tags for each item
key        Keys for each item (optional)


The posting file comprises the real data: it contains a sequence of lists of items posted under 
each hash code. To speed up searching, the entry file is an array of pointers into the posting 
file, one per potential hash code. Furthermore, the items in the lists in the posting file are not 
referred to by their complete tag, but just by an address in the tag file, which gives the com- 
plete tags. The key file is optional and contains a copy of the keys used in the indexing. 
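The relationship between the entry, posting, and tag files can be pictured with in-memory lists. The following is a rough sketch under several assumptions: the hash function is invented (the text does not say which one is used), list positions stand in for byte addresses, an item is posted at most once per hash code, and a -1 marker ends each posting list. The names hash_key, build_index, and postings_for are hypothetical.

```python
def hash_key(key, table_size=997):
    # Assumed hash function; the source does not specify the real one.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % table_size
    return h

def build_index(key_lines, table_size=997):
    """Return (entry, posting, tags), mirroring the entry/posting/tag files."""
    tags = []
    buckets = [set() for _ in range(table_size)]
    for line in key_lines:
        tag, _, keys = line.partition("\t")
        tag_no = len(tags)                 # stands in for an address in the tag file
        tags.append(tag)
        for key in keys.split():
            buckets[hash_key(key, table_size)].add(tag_no)
    entry, posting = [], []
    for bucket in buckets:
        entry.append(len(posting))         # pointer into the posting list
        posting.extend(sorted(bucket))
        posting.append(-1)                 # end-of-list marker (an assumption)
    return entry, posting, tags

def postings_for(code, entry, posting):
    """Read the list of tag numbers posted under one hash code."""
    items = []
    i = entry[code]
    while posting[i] != -1:
        items.append(posting[i])
        i += 1
    return items
```

Because the entry list has one slot per potential hash code, a lookup is a single index operation followed by a short sequential read of the posting list.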


The searching process starts with a query, containing several keys. The goal is to obtain 
all items which were indexed under these keys. The query keys are hashed, and the pointers 
in the entry file used to access the lists in the posting file. These lists are addresses in the tag 
file of documents posted under the hash codes derived from the query. The common items 
from all lists are determined; this must include the items indexed by every key, but may also
contain some items which are false drops, since items referenced by the correct hash codes
need not actually have contained the correct keys. Normally, if there are several keys in the 
query, there are not likely to be many false drops in the final combined list even though each 
hash code is somewhat ambiguous. The actual tags are then obtained from the tag file, and to 
guard against the possibility that an item has false-dropped on some hash code in the query, 
the original items are normally obtained from the delivery program (4) and the query keys 
checked against them by string comparison. 
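A minimal sketch of this two-stage search follows. It assumes an invented hash function, represents the posting lists as a dictionary from hash code to a set of item numbers, and checks false drops with a simple substring test against the item text; none of these details is taken from the hunt source.

```python
def hash_key(key, table_size=997):
    # Assumed hash function; the source does not say which one is used.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % table_size
    return h

def search(query_keys, postings, items, table_size=997):
    """postings: hash code -> set of item numbers; items: list of item texts."""
    # Stage 1: intersect the posting lists for every hash code in the query.
    candidates = None
    for key in query_keys:
        found = postings.get(hash_key(key, table_size), set())
        candidates = set(found) if candidates is None else candidates & found
    # Stage 2: discard false drops by comparing keys with the original items.
    hits = []
    for n in sorted(candidates or ()):
        text = items[n].lower()
        if all(key in text for key in query_keys):
            hits.append(n)
    return hits
```

With a deliberately tiny table (table_size=1) every key collides, so stage 1 returns every item and only the string comparison in stage 2 separates real answers from false drops.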


Usually, therefore, the check for bad drops is made against the original file. However, if 
the key derivation procedure is complex, it may be preferable to check against the keys fed to 
program (2). In this case the optional key file which contains the keys associated with each 
item is generated, and the item tag is supplemented by a string 


sstart,length 


which indicates the starting byte number in the key file and the length of the string of keys 
for each item. This file is not usually necessary with the present key-selection program, since 
the keys always appear in the original document. 


There is also an option (-Cn) for coordination level searching. This retrieves items 
which match all but n of the query keys. The items are retrieved in the order of the number 
of keys that they match. Of course, n must be less than the number of query keys (nothing is 
retrieved unless it matches at least one key). 
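Coordination level searching can be illustrated as follows. This sketch skips the hashing step and works directly on sets of keys; the function name coordination_search and the dictionary layout are assumptions made for the example.

```python
def coordination_search(query_keys, item_keys, n_missing=0):
    """Return items matching all but at most n_missing query keys, best first.

    item_keys maps an item name to the set of keys it was indexed under.
    """
    unique = list(dict.fromkeys(query_keys))      # drop repeated query keys
    if not 0 <= n_missing < len(unique):
        raise ValueError("n must be less than the number of query keys")
    needed = len(unique) - n_missing
    scored = []
    for item, keys in item_keys.items():
        matched = sum(1 for k in unique if k in keys)
        if matched >= needed:
            scored.append((matched, item))
    # Items are delivered in order of the number of keys they match.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored]
```

Requiring n to be less than the number of query keys guarantees that every retrieved item matches at least one key.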


As an example, consider one set of 4377 references, comprising 660,000 bytes. This 
included 51,000 keys, of which 5,900 were distinct keys. The hash table is kept full to save 
space (at the expense of time); 995 of 997 possible hash codes were used. The total set of 
index files (no key file) included 171,000 bytes, about 26% of the original file size. It took 8 
minutes of processor time to hash, sort, and write the index. To search for a single query with 
the resulting index took 1.9 seconds of processor time, while to find the same paper with a 
sequential linear search using grep (reading all of the tags and keys) took 12.3 seconds of pro- 
cessor time. 


We have also used this software to index all of the English stored on our UNIX system. 
This is the index searched by the lookall command. On a typical day there were 29,000 files 
in our user file system, containing about 152,000,000 bytes. Of these, 5,300 files, containing
32,000,000 bytes (about 21%), were English text. The total number of ‘words’ (determined
mechanically) was 5,100,000. Of these, 227,000 were selected as keys; 19,000 were distinct,
hashing to 4,900 (of 5,000 possible) different hash codes. The resulting inverted file indexes 
used 845,000 bytes, or about 2.6% of the size of the original files. The particularly small 
indexes are caused by the fact that keys are taken from only the first 50 non-common words 
of some very long input files. 


Even this large lookall index can be searched quickly. For example, to find this docu- 
ment by looking for the keys “lesk inverted indexes” required 1.7 seconds of processor time 
and system time. By comparison, just to search the 800,000 byte dictionary (smaller than 
even the inverted indexes, let alone the 27,000,000 bytes of text files) with grep takes 29 
seconds of processor time. The lookall program is thus useful when looking for a document 
which you believe is stored on-line, but do not know where. For example, many memos from 
our center are in the file system, but it is often difficult to guess where a particular memo 
might be (it might have several authors, each with many directories, and have been worked on 
by a secretary with yet more directories). Instructions for the use of the lookall command are 
given in the manual section, shown in the appendix to this memorandum. 


The only indexes maintained routinely are those of publication lists and all English files. 
To make other indexes, the programs for making keys, sorting them, searching the indexes, 
and delivering answers must be used. Since they are usually invoked as parts of higher-level 
commands, they are not in the default command directory, but are available to any user in the 
directory /usr/lib/refer. Three programs are of interest: mkey, which isolates keys from input 
files; inv, which makes an index from a set of keys; and hunt, which searches the index and
delivers the items. Note that the two parts of the retrieval phase are combined into one pro-
gram, to avoid the excessive system work and delay which would result from running these as 
separate processes. 


These three commands have a large number of options to adapt to different kinds of 
input. The user not interested in the detailed description that now follows may skip to sec- 
tion 3, which describes the refer program, a packaged-up version of these tools specifically 
oriented towards formatting references. 


Make Keys. The program mkey is the key-making program corresponding to step (1) 
in phase A. Normally, it reads its input from the file names given as arguments, and if there 
are no arguments it reads from the standard input. It assumes that blank lines in the input 
delimit separate items, for each of which a different line of keys should be generated. The 
lines of keys are written on the standard output. Keys are any alphanumeric string in the 
input not among the most frequent words in English and not entirely numeric (except that 
all-numeric strings are acceptable if they are between 1900 and 1999). In the output, keys are 
translated to lower case, and truncated to six characters in length; any associated punctuation 
is removed. The following flag arguments are recognized by mkey: 


-c name     Name of file of common words; default is /usr/lib/eign.

-fname      Read a list of files from name and take each as an input
            argument.

-ichars     Ignore all lines which begin with ‘%’ followed by any character
            in chars.

-kn         Use at most n keys per input item.

-ln         Ignore items shorter than n letters long.

-nm         Ignore as a key any word in the first m words of the list of
            common English words. The default is 100.

-s          Remove the labels (file:start,length) from the output; just give
            the keys. Used when searching rather than indexing.

-w          Each whole file is a separate item; blank lines in files are
            irrelevant.


The normal arguments for indexing references are the defaults, which are -c
/usr/lib/eign, -n100, and -l3. For searching, the -s option is also needed. When the big
lookall index of all English files is run, the options are -w, -k50, and -f (filelist). When
running on textual input, the mkey program processes about 1000 English words per processor
second. Unless the -k option is used (and the input files are long enough for it to take effect)
the output of mkey is comparable in size to its input.


Hash and invert. The inv program computes the hash codes and writes the inverted 
files. It reads the output of mkey and writes the set of files described earlier in this section. 
It expects one argument, which is used as the base name for the three (or four) files to be 
written. Assuming an argument of Index (the default) the entry file is named Index.ia, the
posting file Index.ib, the tag file Index.ic, and the key file (if present) Index.id. The inv pro- 
gram recognizes the following options: 


-a          Append the new keys to a previous set of inverted files, making
            new files if there is no old set using the same base name.

-d          Write the optional key file. This is needed when you cannot
            check for false drops by looking for the keys in the original
            inputs, i.e. when the key derivation procedure is complicated
            and the output keys are not words from the input files.

-hn         The hash table size is n (default 997); n should be prime.
            Making n bigger saves search time and spends disk space.

-i[u] name  Take input from file name, instead of the standard input; if u
            is present name is unlinked when the sort is started. Using
            this option permits the sort scratch space to overlap the disk
            space used for input keys.

-n          Make a completely new set of inverted files, ignoring previous
            files.

-p          Pipe into the sort program, rather than writing a temporary
            input file. This saves disk space and spends processor time.

-v          Verbose mode; print a summary of the number of keys which
            finished indexing.


About half the time used in inv is in the contained sort. Assuming the sort is roughly 
linear, however, a guess at the total timing for inv is 250 keys per second. The space used is 
usually of more importance: the entry file uses four bytes per possible hash (note the -h
option), and the tag file around 15-20 bytes per item indexed. Roughly, the posting file con-
tains one item for each key instance and one item for each possible hash code; the items are
two bytes long if the tag file is less than 65536 bytes long, and the items are four bytes wide if
the tag file is greater than 65536 bytes long. Note that to minimize storage, the hash tables
should be over-full; for most of the files indexed in this way, there is no other real choice, 
since the entry file must fit in memory. 
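The space rules of thumb in this paragraph can be turned into a rough estimator. The sketch below is only an approximation of the stated rules: the figure of 18 bytes per tag is an assumed midpoint of the quoted 15-20 byte range, the treatment of a tag file of exactly 65536 bytes is a guess, and the function name is invented. It should not be expected to reproduce measured index sizes exactly.

```python
def estimate_index_bytes(items, key_instances, table_size=997,
                         tag_bytes_per_item=18):
    """Rough size of the entry, tag, and posting files, in bytes."""
    entry_bytes = 4 * table_size                  # four bytes per possible hash
    tag_bytes = tag_bytes_per_item * items        # assumed 15-20 bytes per item
    # Posting items are two bytes wide for a small tag file, four otherwise.
    pointer_width = 2 if tag_bytes < 65536 else 4
    posting_bytes = pointer_width * (key_instances + table_size)
    return {
        "entry": entry_bytes,
        "tag": tag_bytes,
        "posting": posting_bytes,
        "total": entry_bytes + tag_bytes + posting_bytes,
    }
```

The estimate makes the trade-off behind the -h option visible: a larger table grows the entry and posting files by a few bytes per hash code, in exchange for shorter lists to scan.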


Searching and Retrieving. The hunt program retrieves items from an index. It 
combines, as mentioned above, the two parts of phase (B): search and delivery. The reason 
why it is efficient to combine delivery and search is partly to avoid starting unnecessary 
processes, and partly because the delivery operation must be a part of the search operation in 
any case. Because of the hashing, the search part takes place in two stages: first items are 
retrieved which have the right hash codes associated with them, and then the actual items are 
inspected to determine false drops, i.e. to determine if anything with the right hash codes 
doesn’t really have the right keys. Since the original item is retrieved to check on false drops, 
it is efficient to present it immediately, rather than only giving the tag as output and later 
retrieving the item again. If there were a separate key file, this argument would not apply, 
but separate key files are not common. 


Input to hunt is taken from the standard input, one query per line. Each query should 
be in mkey -s output format: all lower case, no punctuation. The hunt program takes one
argument which specifies the base name of the index files to be searched. Only one set of 
index files can be searched at a time, although many text files may be indexed as a group, of 
course. If one of the text files has been changed since the index, that file is searched with 
fgrep; this may occasionally slow down the searching, and care should be taken to avoid hav- 
ing many out of date files. The following option arguments are recognized by hunt: 


-a          Give all output; ignore checking for false drops.

-Cn         Coordination level n; retrieve items with not more than n
            terms of the input missing; default C0, implying that each
            search term must be in the output items.

-F[ynd]     “-Fy” gives the text of all the items found; “-Fn” suppresses
            them. “-Fd” where d is an integer gives the text of the first
            d items. The default is -Fy.

-g          Do not use fgrep to search files changed since the index was
            made; print an error comment instead.

-istring    Take string as input, instead of reading the standard input.

-ln         The maximum length of internal lists of candidate items is n;
            default 1000.

-o string   Put text output (“-Fy”) in string; of use only when invoked
            from another program.

-p          Print hash code frequencies; mostly for use in optimizing hash
            table sizes.

-T[ynd]     “-Ty” gives the tags of the items found; “-Tn” suppresses
            them. “-Td” where d is an integer gives the first d tags. The
            default is -Tn.

-t string   Put tag output (“-Ty”) in string; of use only when invoked
            from another program.


The timing of hunt is complex. Normally the hash table is overfull, so that there will be 
many false drops on any single term; but a multi-term query will have few false drops on all 
terms. Thus if a query is underspecified (one search term) many potential items will be exam- 
ined and discarded as false drops, wasting time. If the query is overspecified (a dozen search
terms) many keys will be examined only to verify that the single item under consideration has 
that key posted. The variation of search time with number of keys is shown in the table 
below. Queries of varying length were constructed to retrieve a particular document from the 
file of references. In the sequence to the left, search terms were chosen so as to select the 
desired paper as quickly as possible. In the sequence on the right, terms were chosen 
inefficiently, so that the query did not uniquely select the desired document until four keys 
had been used. The same document was the target in each case, and the final set of eight 
keys are also identical; the differences at five, six and seven keys are produced by measure- 
ment error, not by the slightly different key lists. 


                 Efficient Keys                                   Inefficient Keys
No. keys  Total drops    Retrieved  Search time    No. keys  Total drops    Retrieved  Search time
          (incl. false)  documents  (seconds)                (incl. false)  documents  (seconds)
   1          15             3        1.27            1          68            55        5.96
   2           1             1        0.11            2          29            29        2.72
   3           1             1        0.14            3           8             8        0.95
   4           1             1        0.17            4           1             1        0.18
   5           1             1        0.19            5           1             1        0.21
   6           1             1        0.23            6           1             1        0.22
   7           1             1        0.27            7           1             1        0.26
   8           1             1        0.29            8           1             1        0.29


As would be expected, the optimal search is achieved when the query just specifies the answer; 
however, overspecification is quite cheap. Roughly, the time required by hunt can be approxi- 
mated as 30 milliseconds per search key plus 75 milliseconds per dropped document (whether 
it is a false drop or a real answer). In general, overspecification can be recommended; it pro- 
tects the user against additions to the data base which turn previously uniquely-answered 
queries into ambiguous queries. 
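The approximation quoted above is simple enough to write down directly. The function name is invented for this sketch; the two constants are the ones given in the text.

```python
def estimated_search_seconds(n_keys, n_drops, per_key=0.030, per_drop=0.075):
    """Approximate hunt processor time: 30 ms per key plus 75 ms per drop."""
    return n_keys * per_key + n_drops * per_drop
```

For a single key with 15 drops the formula gives about 1.16 seconds, against a measured 1.27; for eight keys and one drop it gives about 0.32 seconds, against a measured 0.29. The model is rough, but it shows why an extra key (30 ms) costs far less than an extra dropped document (75 ms).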


The careful reader will have noted an enormous discrepancy between these times and 
the earlier quoted time of around 1.9 seconds for a search. The times here are purely for the 
search and retrieval: they are measured by running many searches through a single invocation 
of the hunt program alone. The normal retrieval operation involves using the shell to set up a 
pipeline through mkey to hunt and starting both processes; this adds a fixed overhead of 
about 1.7 seconds of processor time to any single search. Furthermore, remember that all 
these times are processor times: on a typical morning on our PDP 11/70 system, with about one 
dozen people logged on, to obtain 1 second of processor time for the search program took 
between 2 and 12 seconds of real time, with a median of 3.9 seconds and a mean of 4.8 
seconds. Thus, although the work involved in a single search may be only 200 milliseconds, 
after you add the 1.7 seconds of startup processor time and then assume a 4:1 
elapsed/processor time ratio, it will be 8 seconds before any response is printed. 
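The arithmetic behind the 8 second figure can be checked with a one-line calculation. The function and parameter names are invented for illustration; the default values are the ones quoted in the paragraph above.

```python
def response_seconds(search_cpu=0.2, startup_cpu=1.7, elapsed_per_cpu=4.0):
    """Elapsed time before a response, given processor times and a load ratio."""
    return (search_cpu + startup_cpu) * elapsed_per_cpu
```

With the defaults this is (0.2 + 1.7) x 4 = 7.6 seconds, which the text rounds to 8. The startup overhead, not the search itself, dominates the wait.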
3. Selecting and Formatting References for TROFF


The major application of the retrieval software is refer, which is a troff preprocessor like
eqn.³ It scans its input looking for items of the form


.[
imprecise citation
.]


where an imprecise citation is merely a string of words found in the relevant bibliographic 
citation. This is translated into a properly formatted reference. If the imprecise citation does 
not correctly identify a single paper (either selecting no papers or too many) a message is 
given. The data base of citations searched may be tailored to each system, and individual 
users may specify their own citation files. On our system, the default data base is accumu- 
lated from the publication lists of the members of our organization, plus about half a dozen 
personal bibliographies that were collected. The present total is about 4300 citations, but this
increases steadily. Even now, the data base covers a large fraction of local citations. 


For example, the reference for the eqn paper above was specified as 


... preprocessor like
.I eqn.
.[
kernighan cherry acm 1975
.]
It scans its input looking for items


This paper was itself printed using refer. The above input text was processed by refer as well 
as tbl and troff by the command 


refer memo-file | tbl | troff -ms


and the reference was automatically translated into a correct citation to the ACM paper on 
mathematical typesetting. 


The procedure to use to place a reference in a paper using refer is as follows. First, use 
the lookbib command to check that the paper is in the data base and to find out what keys 
are necessary to retrieve it. This is done by typing lookbib and then typing some potential 
queries until a suitable query is found. For example, had one started to find the eqn paper 
shown above by presenting the query 


$ lookbib 
kernighan cherry 
(EOT) 


lookbib would have found several items; experimentation would quickly have shown that the 
query given above is adequate. Overspecifying the query is of course harmless. A particularly 
careful reader may have noticed that “acm” does not appear in the printed citation; we have 
supplemented some of the data base items with common extra keywords, such as common 
abbreviations for journals or other sources, to aid in searching. 


If the reference is in the data base, the query that retrieved it can be inserted in the 
text, between .[ and .] brackets. If it is not in the data base, it can be typed into a private file 
of references, using the format discussed in the next section, and then the -p option used to
search this private file. Such a command might read (if the private references are called
myfile)


refer -p myfile document | tbl | eqn | troff -ms ...


where tbl and/or eqn could be omitted if not needed. The use of the -ms macros⁴ or some
other macro package, however, is essential. Refer only generates the data for the references;
exact formatting is done by some macro package, and if none is supplied the references will
not be printed.


³ B. W. Kernighan and L. L. Cherry, “A System for Typesetting Mathematics,” Comm. Assoc. Comp.
Mach., vol. 18, pp. 151-157, Bell Laboratories, Murray Hill, New Jersey, March 1975.


By default, the references are numbered sequentially, and the -ms macros format references
as footnotes at the bottom of the page. This memorandum is an example of that style.
Other possibilities are discussed in section 5 below. 


4. Reference Files. 


A reference file is a set of bibliographic references usable with refer. It can be indexed 
using the software described in section 2 for fast searching. What refer does is to read the 
input document stream, looking for imprecise citation references. It then searches through 
reference files to find the full citations, and inserts them into the document. The format of 
the full citation is arranged to make it convenient for a macro package, such as the -ms macros,
to format the reference for printing. Since the format of the final reference is determined
by the desired style of output, which is determined by the macros used, refer avoids forcing 
any kind of reference appearance. All it does is define a set of string registers which contain 
the basic information about the reference; and provide a macro call which is expanded by the 
macro package to format the reference. It is the responsibility of the final macro package to 
see that the reference is actually printed; if no macros are used, and the output of refer fed 
untranslated to troff, nothing at all will be printed. 


The strings defined by refer are taken directly from the files of references, which are in 
the following format. The references should be separated by blank lines. Each reference is a 
sequence of lines beginning with % and followed by a key-letter. The remainder of that line, 
and successive lines until the next line beginning with %, contain the information specified by 
the key-letter. In general, refer does not interpret the information, but merely presents it to 
the macro package for final formatting. A user with a separate macro package, for example, 
can add new key-letters or use the existing ones for other purposes without bothering refer. 
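Reading this file format is straightforward. The sketch below is an illustration rather than the refer parser: the function name is invented, fields are kept as (key-letter, value) pairs in file order so that authors stay in the order given, continuation lines are joined with a single space, and the doubled %% form is not handled.

```python
def parse_references(text):
    """Split a reference file into records of (key_letter, value) pairs."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip():                  # a blank line ends a reference
            if current:
                records.append(current)
                current = []
            continue
        if line.startswith("%") and len(line) > 1:
            current.append((line[1], line[2:].strip()))
        elif current:                         # continuation of the previous field
            letter, value = current[-1]
            current[-1] = (letter, (value + " " + line.strip()).strip())
    if current:
        records.append(current)
    return records
```

Because the parser attaches no meaning to the key-letters, a private macro package could introduce new letters without any change here, which matches the behaviour described for refer.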


The meaning of the key-letters given below, in particular, is that assigned by the -ms
macros. Not all information, obviously, is used with each citation. For example, if a docu-
ment is both an internal memorandum and a journal article, the macros ignore the memoran- 
dum version and cite only the journal article. Some kinds of information are not used at all in 
printing the reference; if a user does not like finding references by specifying title or author 
keywords, and prefers to add specific keywords to the citation, a field is available which is 
searched but not printed (K). 


The key letters currently recognized by refer and -ms, with the kind of information
implied, are:


Key  Information specified              Key  Information specified
A    Author’s name                      N    Issue number
B    Title of book containing item      O    Other information
C    City of publication                P    Page(s) of article
D    Date                               R    Technical report reference
E    Editor of book containing item     T    Title
G    Government (NTIS) ordering number  V    Volume number
I    Issuer (publisher)
J    Journal name
K    Keys (for searching)               X or
L    Label                              Y or
M    Memorandum label                   Z    Information not used by refer


⁴ M. E. Lesk, Typing Documents on UNIX and GCOS: The -ms Macros for Troff, 1977.


For example, a sample reference could be typed as: 


%T Bounds on the Complexity of the Maximal
Common Subsequence Problem
%Z ctr127
%A A. V. Aho
%A D. S. Hirschberg
%A J. D. Ullman
%J J. ACM
%V 23
%N 1
%P 1-12
%M abcd-78
%D Jan. 1976


Order is irrelevant, except that authors are shown in the order given. The output of refer is a 
stream of string definitions, one for each of the fields of each reference, as shown below. 


.]-
.ds [A authors’ names ...
.ds [T title ...
.ds [J journal ...
...
.][ type-number


The special macro .]- precedes the string definitions and the special macro .][ follows. These
are changed from the input .[ and .] so that running the same file through refer again is
harmless. The .]- macro can be used by the macro package to initialize. The .][ macro,
which should be used to print the reference, is given an argument type-number to indicate 
the kind of reference, as follows: 


Value Kind of reference 
1 Journal article 
2 Book 
3 Article within book 
4 Technical report 
5 Bell Labs technical memorandum 
0 Other 


The reference is flagged in the text with the sequence 
\*([.number\*(.]


where number is the footnote number. The strings [. and .] should be used by the macro 
package to format the reference flag in the text. These strings can be replaced for a particular 
footnote, as described in section 5. The footnote number (or other signal) is available to the
reference macro .][ as the string register [F.


In some cases users wish to suspend the searching, and merely use the reference macro 
formatting. That is, the user doesn’t want to provide a search key between .[ and .] brackets, 
but merely the reference lines for the appropriate document. Alternatively, the user can wish 
to add a few fields to those in the reference as in the standard file, or override some fields. 
Altering or replacing fields, or supplying whole references, is easily done by inserting lines 
beginning with %; any such line is taken as direct input to the reference processor rather than 
keys to be searched. Thus 


.[
key1 key2 key3 ...
%Q New format item
%R Override report name
.]


makes the indicated changes to the result of searching for the keys. All of the search keys
must be given before the first % line.


If no search keys are provided, an entire citation can be provided in-line in the text. For 
example, if the eqn paper citation were to be inserted in this way, rather than by searching 
for it in the data base, the input would read 


... preprocessor like
.I eqn.
.[
%A B. W. Kernighan
%A L. L. Cherry
%T A System for Typesetting Mathematics
%J Comm. ACM
%V 18
%N 3
%P 151-157
%D March 1975
.]
It scans its input looking for items


This would produce a citation of the same appearance as that resulting from the file search. 


As shown, fields are normally turned into troff strings. Sometimes users would rather 
have them defined as macros, so that other troff commands can be placed into the data. 
When this is necessary, simply double the control character % in the data. Thus the input 


.[
%V 23
%%M
Bell Laboratories,
Murray Hill, N.J. 07974
.]


is processed by refer into 


.ds [V 23
.de [M
Bell Laboratories,
Murray Hill, N.J. 07974
..


The information after %%M is defined as a macro to be invoked by .[M while the
information after %V is turned into a string to be invoked by \*([V. At present -ms expects
all information as strings.


5. Collecting References and other Refer Options 


Normally, the combination of refer and -ms formats output as troff footnotes which are
consecutively numbered and placed at the bottom of the page. However, options exist to
place the references at the end; to arrange references alphabetically by senior author; and to
indicate references by strings in the text of the form [Name1975a] rather than by number.
Whenever references are not placed at the bottom of a page, identical references are coalesced.


For example, the -e option to refer specifies that references are to be collected; in this
case they are output whenever the sequence


.[
$LIST$
.]


is encountered. Thus, to place references at the end of a paper, the user would run refer with
the -e option and place the above $LIST$ commands after the last line of the text. Refer
will then move all the references to that point. To aid in formatting the collected references,
refer writes the references preceded by the line


.]<


and followed by the line


.]>


to invoke special macros before and after the references.


Another possible option to refer is the -s option to specify sorting of references. The
default, of course, is to list references in the order presented. The -s option implies the -e
option, and thus requires a


.[
$LIST$
.]


entry to call out the reference list. The -s option may be followed by a string of letters,
numbers, and ‘+’ signs indicating how the references are to be sorted. The sort is done using
the fields whose key-letters are in the string as sorting keys; the numbers indicate how many
of the fields are to be considered, with ‘+’ taken as a large number. Thus the default is
-sAD meaning “Sort on senior author, then date.” To sort on all authors and then title,
specify -sA+T. And to sort on two authors and then the journal, write -sA2J.
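The structure of the sort string can be made concrete with a small parser. This is a sketch of one reading of the description, not the refer implementation: the names are invented, a letter with no number is taken to mean one field (consistent with -sAD sorting on the senior author only), and a fixed large constant stands in for ‘+’.

```python
import re

LARGE = 10**6   # stands in for "a large number" when '+' is given

def parse_sort_spec(spec="AD"):
    """Turn the string after -s into a list of (key_letter, max_fields)."""
    result = []
    for letter, count in re.findall(r"([A-Za-z])(\d+|\+)?", spec):
        if count == "+":
            n = LARGE          # consider every field with this key-letter
        elif count:
            n = int(count)     # consider only the first n such fields
        else:
            n = 1              # assumed default: just the first field
        result.append((letter, n))
    return result
```

Under this reading, "A2J" parses to two author fields followed by one journal field, and "A+T" to all authors followed by the title.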


Other options to refer change the signal or label inserted in the text for each reference.
Normally these are just sequential numbers, and their exact placement (within brackets, as
superscripts, etc.) is determined by the macro package. The -l option replaces reference
numbers by strings composed of the senior author’s last name, the date, and a disambiguating
letter. If a number follows the l as in -l3 only that many letters of the last name are used in
the label string. To abbreviate the date as well the form -lm,n shortens the last name to the
first m letters and the date to the last n digits. For example, the option -l3,2 would refer to
the eqn paper (reference 3) by the signal Ker75a, since it is the first cited reference by
Kernighan in 1975.
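The label construction can be sketched as below. The function name and argument layout are invented; the sketch assumes the date field contains a four-digit year and that the last name is the final word of the author field before any comma.

```python
import re

def make_label(author, date, m=None, n=None, seq=0):
    """Build a label from last name, year, and a disambiguating letter.

    m: letters of the last name to keep (None = all)
    n: trailing digits of the year to keep (None = all)
    seq: 0 for the first such reference, 1 for the second, and so on
    """
    last = author.split(",")[0].split()[-1]          # drop any ", Jr." suffix
    match = re.search(r"\d{4}", date)
    year = match.group() if match else ""
    if m is not None:
        last = last[:m]
    if n is not None:
        year = year[max(0, len(year) - n):]
    return last + year + chr(ord("a") + seq)
```

For the eqn paper, make_label("B. W. Kernighan", "March 1975", 3, 2) gives Ker75a, the signal quoted above for the option with arguments 3 and 2.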


A user wishing to specify particular labels for a private bibliography may use the -k
option. Specifying -kx causes the field x to be used as a label. The default is L. If this field
ends in -, that character is replaced by a sequence letter; otherwise the field is used exactly as
given.


If none of the refer-produced signals are desired, the -b option entirely suppresses
automatic text signals.


If the user wishes to override the -ms treatment of the reference signal (which is normally
to enclose the number in brackets in nroff and make it a superscript in troff) this can
be done easily. If the lines .[ or .] contain anything following these characters, the remainders
of these lines are used to surround the reference signal, instead of the default. Thus, for
example, to say “See reference (2).” and avoid “See reference.²” the input might appear


See reference
.[ (
imprecise citation ...
.]).


Note that blanks are significant in this construction. If a permanent change is desired in the
style of reference signals, however, it is probably easier to redefine the strings [. and .] (which
are used to bracket each signal) than to change each citation.


Although normally refer limits itself to retrieving the data for the reference, and leaves
to a macro package the job of arranging that data as required by the local format, there are
two special options for rearrangements that cannot be done by macro packages. The -c
option puts fields into all upper case (CAPS-SMALL CAPS in troff output). The key-letters
indicating what information is to be translated to upper case follow the c, so that -cAJ means
that authors’ names and journals are to be in caps. The -a option writes the names of
authors last name first, that is A. D. Hall, Jr. is written as Hall, A. D. Jr. The citation form
of the Journal of the ACM, for example, would require both -cA and -a options. This
produces authors’ names in the style KERNIGHAN, B. W. AND CHERRY, L. L. for the previous
example. The -a option may be followed by a number to indicate how many author names
should be reversed; -a1 (without any -c option) would produce Kernighan, B. W. and L. L.
Cherry, for example.
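The two rearrangements can be imitated in a few lines. This is an illustrative sketch with invented function names; in particular, the way more than two names are joined is an assumption, since the text only shows two-author examples.

```python
def reverse_name(name):
    """Write a name last name first: 'A. D. Hall, Jr.' -> 'Hall, A. D. Jr.'"""
    main, _, suffix = name.partition(",")
    parts = main.split()
    last, initials = parts[-1], " ".join(parts[:-1])
    out = f"{last}, {initials}" if initials else last
    suffix = suffix.strip()
    return f"{out} {suffix}" if suffix else out

def format_authors(authors, reverse=None, caps=False):
    """reverse: how many leading names to reverse (None = all of them)."""
    count = len(authors) if reverse is None else reverse
    names = [reverse_name(a) if i < count else a for i, a in enumerate(authors)]
    if caps:
        names = [n.upper() for n in names]
    joiner = " AND " if caps else " and "
    if len(names) <= 2:
        return joiner.join(names)
    # Joining three or more names this way is an assumption of the sketch.
    return ", ".join(names[:-1]) + "," + joiner + names[-1]
```

Reversing all names with capitals reproduces KERNIGHAN, B. W. AND CHERRY, L. L., while reversing only the first name gives Kernighan, B. W. and L. L. Cherry.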


Finally, there is also the previously-mentioned —p option to let the user specify a private 
file of references to be searched before the public files. Note that refer does not insist on a 
previously made index for these files. If a file is named which contains reference data but is 
not indexed, it will be searched (more slowly) by refer using fgrep. In this way it is easy for 
users to keep small files of new references, which can later be added to the public data bases. 


Updating Publication Lists


M. E. Lesk 


1. Introduction. 


This note describes several commands to update the publication lists. The data base 
consisting of these lists is kept in a set of files in the directory /usr/dict/papers on the
Version 7 UNIX† system. The reason for having special commands to update these files is that
they are indexed, and the only reasonable way to find the items to be updated is to use the 
index. However, altering the files destroys the usefulness of the index, and makes further 
editing difficult. So the recommended procedure is to 


(1) Prepare additions, deletions, and changes in separate files. 
(2) Update the data base and reindex. 


Whenever you make changes, additions, or deletions, it is necessary to run the “add & index” step before logging
off; otherwise the changes do not take effect. The next section shows the format of the files in 
the data base. After that, the procedures for preparing additions, preparing changes, prepar- 
ing deletions, and updating the public data base are given. 


2. Publication Format. 


The format of a data base entry is given completely in “Some Applications of Inverted 
Indexes on UNIX” by M. E. Lesk, the first part of this report, and is summarized here via a 
few examples. In each example, first the output format for an item is shown, and then the 
corresponding data base entry. 


Journal article: 
A. V. Aho, D. S. Hirschberg, and J. D. Ullman, “Bounds on the
Complexity of the Maximal Common Subsequence Problem,” J. Assoc.
Comp. Mach., vol. 23, no. 1, pp. 1-12 (Jan. 1976).


%T Bounds on the Complexity of the Maximal Common
Subsequence Problem
%A A. V. Aho
%A D. S. Hirschberg
%A J. D. Ullman
%J J. Assoc. Comp. Mach.
%V 23
%N 1
%P 1-12
%D Jan. 1976
%M Memo abcd...


† UNIX is a trademark of Bell Laboratories.


Conference proceedings:
B. Prabhala and R. Sethi, “Efficient Computation of Expressions with
Common Subexpressions,” Proc. 5th ACM Symp. on Principles of
Programming Languages, pp. 222-230, Tucson, Ariz. (January 1978).


%A B. Prabhala
%A R. Sethi
%T Efficient Computation of Expressions with
Common Subexpressions
%J Proc. 5th ACM Symp. on Principles
of Programming Languages
%C Tucson, Ariz.
%D January 1978
%P 222-230


Book:
B. W. Kernighan and P. J. Plauger, Software Tools, Addison-Wesley,
Reading, Mass. (1976).


%T Software Tools
%A B. W. Kernighan
%A P. J. Plauger
%I Addison-Wesley
%C Reading, Mass.
%D 1976


Article within book: 
J. W. de Bakker, “Semantics of Programming Languages,” pp. 173-227 
in Advances in Information Systems Science, Vol. 2, ed. J. T. Tou, 
Plenum Press, New York, N. Y. (1969). 


%A J. W. de Bakker
%T Semantics of programming languages
%E J. T. Tou
%B Advances in Information Systems Science, Vol. 2
%I Plenum Press
%C New York, N. Y.
%D 1969
%P 173-227


Technical Report: 
F. E. Allen, “Bibliography on Program Optimization,” Report RC- 
5767, IBM T. J. Watson Research Center, Yorktown Heights, N. Y. 
(1975). 


%A F. E. Allen
%D 1975
%T Bibliography on Program Optimization
%R Report RC-5767
%I IBM T. J. Watson Research Center
%C Yorktown Heights, N. Y.


Other forms of publication can be entered similarly. Note that conference proceedings are
entered as if journals, with the conference name on a %J line. This is also sometimes
appropriate for obscure publications such as series of lecture notes. When something is both a 
report and an article, or both a memorandum and an article, enter all necessary information 
for both; see the first article above, for example. Extra information (such as “In preparation” 
or “Japanese translation”) should be placed on a line beginning %O. The most common use
of %O lines now is for “Also in ...” to give an additional reference to a secondary appearance 
of the same paper. 


Some of the possible fields of a citation are: 


Letter  Meaning                 Letter  Meaning
A       Author                  K       Extra keys
B       Book including item     N       Issue number
C       City of publication     O       Other
D       Date                    P       Page numbers
E       Editor of book          R       Report number
I       Publisher (issuer)      T       Title of item
J       Journal name            V       Volume number


Note that %B is used to indicate the title of a book containing the article being entered; when 
an item is an entire book, the title should be entered with a %T as usual. 


Normally, the order of items does not matter. The only exception is that if there are 
multiple authors (%A lines) the order of authors should be that on the paper. If a line is too 
long, it may be continued on to the next line; any line not beginning with % or . (dot) is 
assumed to be a continuation of the previous line. Again, see the first article above for an 
example of a long title. Except for authors, do not repeat any items; if two %J lines are 
given, for example, the first is ignored. Multiple items on the same file should be separated 
by blank lines. 
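
The rules above (blank lines separate items, continuation lines, repeated %A lines kept in order, a later duplicate field replacing an earlier one) can be sketched as a small reader. This is a hedged illustration in Python, not the code used by the publication-list commands; the name parse_entries and the dictionary layout are assumptions, and lines beginning with a dot are simply skipped here.

```python
def parse_entries(text):
    """Split data base text into entries; each entry maps a field letter to its value."""
    entries = []
    current = None      # entry being built, or None between entries
    last_key = None     # field letter that a continuation line extends
    for raw in text.splitlines():
        line = raw.strip()
        if not line:                      # a blank line ends the current item
            if current:
                entries.append(current)
            current, last_key = None, None
        elif line.startswith("."):        # dot lines are not field data (assumption: skip)
            continue
        elif line.startswith("%") and len(line) >= 2:
            key, value = line[1], line[2:].strip()
            if current is None:
                current = {}
            if key == "A":                # authors may repeat; keep the order given
                current.setdefault("A", []).append(value)
            else:                         # any other repeated field: the later line wins
                current[key] = value
            last_key = key
        elif current is not None and last_key is not None:
            # Continuation of the previous field line.
            if last_key == "A":
                current["A"][-1] += " " + line
            else:
                current[last_key] += " " + line
    if current:
        entries.append(current)
    return entries
```

Applied to the journal article shown earlier, such a reader would return one entry whose “T” value is the full two-line title and whose “A” value lists the three authors in the order they appear on the paper.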


Note that in formatted printouts of the file, the exact appearance of the items is deter- 
mined by a set of macros and the formatting programs. Do not try to adjust fonts, punctua- 
tion, etc. by editing the data base; it is wasted effort. In case someone has a real need for a 
differently-formatted output, a new set of macros can easily be generated to provide alterna- 
tive appearances of the citations. 


3. Updating and Re-indexing. 

This section describes the commands that are used to manipulate and change the data 
base. It explains the procedures for (a) finding references in the data base, (b) adding new 
references, (c) changing existing references, and (d) deleting references. Remember that all 
changes, additions, and deletions are done by preparing separate files and then running an 
‘update and reindex’ step. 


Checking what’s there now. Often you will want to know what is currently in the data 
base. There is a special command lookbib to look for things and print them out. It searches 
for articles based on words in the title, or the author’s name, or the date. For example, you 
could find the first paper above with 


lookbib aho ullman maximal subsequence 1976 
or 


lookbib aho ullman hirschberg 


If you don’t give enough words, several items will be found; if you spell some wrong, nothing 
will be found. There are around 4300 papers in the public file; you should always use this 
command to check when you are not sure whether a certain paper is there or not. 
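
The behavior of the keys can be pictured with a short sketch: an item is found only if every key appears in it, with upper and lower case treated alike. This is a simplified illustration in Python that scans the items directly; the real lookbib uses the index, and the names words_of and find_items are hypothetical.

```python
import re


def words_of(entry_text):
    """Return the set of lower-case words and numbers in one item."""
    body = re.sub(r"(?m)^%\S", " ", entry_text)   # drop the %X field tags
    return set(re.findall(r"[a-z0-9]+", body.lower()))


def find_items(entries, keys):
    """Return every item that contains all of the given keys."""
    wanted = {k.lower() for k in keys}
    return [e for e in entries if wanted <= words_of(e)]
```

With too few keys, find_items returns several items; with a misspelled key, the set of wanted words is never fully present and the result is empty, which matches the behavior described above.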


Additions. To add new papers, just type in, on one or more files, the citations for the 
new papers. Remember to check first if the papers are already in the data base. For example, 
if a paper has a previous memo version, this should be treated as a change to an existing
entry, rather than a new entry. If several new papers are being typed on the same file, be sure 
that there is a blank line between each two papers. 


Changes. To change an item, it should be extracted onto a file. This is done with the 
command 


pub.chg key1 key2 key3 ...


where the items key1, key2, key3, etc. are a set of keys that will find the paper, as in the
lookbib command. That is, if


lookbib johnson yacc cstr


will find an item (in this case, Computing Science Technical Report No. 32, “YACC: Yet
Another Compiler-Compiler,” by S. C. Johnson) then


pub.chg johnson yacc cstr 


will permit you to edit the item. The pub.chg command extracts the item onto a file named 
“bibxxx” where “xxx” is a 3-digit number, e.g. “bib234”. The command will print the file 
name it has chosen. If the set of keys finds more than one paper (or no papers) an error mes- 
sage is printed and no file is written. Each reference to be changed must be extracted with a 
separate pub.chg command, and each will be placed on a separate file. You should then edit 
the “bibxxx” file as desired to change the item, using the UNIX editor. Do not delete or 
change the first line of the file, however, which begins %# and is a special code line to tell the 
update program which item is being altered. You may delete or change other lines, or add 
lines, as you wish. The changes are not actually made in the public data base until you run 
the update command pub.run (see below). Thus, if after extracting an item and modifying it, 
you decide that you’d rather leave things as they were, delete the “bibxxx” file, and your 
change request will disappear. 


Deletions. To delete an entry from the data base, type the command


pub.del key1 key2 key3 ...


where the items key1, key2, etc. are a set of keys that will find the paper, as with the lookbib
command. That is, if 


lookbib Aho hirschberg ullman


will find a paper,


pub.del aho hirschberg ullman


deletes it. Note that upper and lower case are equivalent in keys. The pub.del command will 
print the entry being deleted. It also gives the name of a “bibxxx” file on which the deletion
command is stored. The actual deletion is not done until the changes, additions, etc. are pro- 
cessed, as with the pub.chg command. If, after seeing the item to be deleted, you change your 
mind about throwing it away, delete the “bibxxx” file and the delete request disappears. 
Again, if the list of keys does not uniquely identify one paper, an error message is given. 


Remember that the default versions of the commands described here edit a public data 
base. Do not delete items unless you are sure deletion is proper; usually this means that there 
are duplicate entries for the same paper. Otherwise, view requests for deletion with skepti- 
cism; even if one person has no need for a particular item in the data base, someone else may 
want it there. 


If an item is correct, but should not appear in the “List of Publications” as normally 
produced, add the line 


%K DNL 


to the item. This preserves the item intact, but implies “Do Not List” to the commands
that print publication lists. The DNL line is normally used for some technical reports,
minor memoranda, or other low-grade publications.


Update and reindex. When you have completed a session of changes, you should type 
the command 


pub.run file1 file2 ...


where the names “file1”, ... are the new files of additions you have prepared. You need not 
list the “bibxxx” files representing changes and deletions; they are processed automatically. 
All of the new items are edited into the standard public data base, and then a new index is 
made. This process takes about 15 minutes; during this time, searches of the data base will be 
slower. 


Normally, you should execute pub.run just before you log off after performing some edit
requests. However, if you don’t, the various change request files remain in your directory
until you finally do execute pub.run. When the changes are processed, the “bibxxx” files are
deleted. It is not desirable to wait too long before processing changes, however, to avoid 
conflicts with someone else who wishes to change the same file. If executing pub.run produces 
the message “File bibxxx too old” it means that someone else has been editing the same file 
between the time you prepared your changes, and the time you typed pub.run. You must 
delete such old change files and re-enter them. 


Note that although pub.run discards the “bibxxx” files after processing them, your files
of additions are left around even after pub.run is finished. If they were typed in only for pur- 
poses of updating the data base, you may delete them after they have been processed by 
pub.run. 


Example. Suppose, for example, that you wish to 


(1) Add to the data base the memos “The Dilogarithm Function of a Real Argument” by R. 
Morris, and “UNIX Software Distribution by Communication Link,” by M. E. Lesk and 
A. S. Cohen; 


(2) Delete from the data base the item “Cheap Typesetters”, by M. E. Lesk, SIGLASH 
Newsletter, 1973; and 


(3) Change “J. Assoc. Comp. Mach.” to “Jour. ACM” in the citation for Aho, Hirschberg, 
and Ullman shown above. 


The procedure would be as follows. First, you would make a file containing the additions, 
here called “new.1”, in the normal way using the UNIX editor. In the script shown below, the 
computer prompts are in italics. 


$ ed new.1
?
a
%T The Dilogarithm Function of a Real Argument
%A Robert Morris
%M abcd
%D 1978

%T UNIX Software Distribution by Communication Link
%A M. E. Lesk
%A A. S. Cohen
%M abcd
%D 1978
.
w new.1
199
q


Next you would specify the deletion, which would be done with the pub.del command:


$ pub.del lesk cheap typesetters siglash


to which the computer responds:


Will delete: (file bib176)

%T Cheap Typesetters
%A M. E. Lesk
%J ACM SIGLASH Newsletter
%V 6
%N 4
%P 14-16
%D October 1973


And then you would extract the Aho, Hirschberg and Ullman paper. The dialogue involved is 
shown below. First run pub.chg to extract the paper; it responds by printing the citation and 
informing you that it was placed on file bib123. That file is then edited.


$ pub.chg aho hirschberg ullman
Extracting as file bib123
%T Bounds on the Complexity of the Maximal
Common Subsequence Problem
%A A. V. Aho
%A D. S. Hirschberg
%A J. D. Ullman
%J J. Assoc. Comp. Mach.
%V 23
%N 1
%P 1-12
%M abcd
%D Jan. 1976

$ ed bib123
312
/Assoc/s/ J/ Jour/p
%J Jour. Assoc. Comp. Mach.
s/Assoc.*/ACM/p
%J Jour. ACM
1,$p
%# /usr/dict/papers/p76 233 245 change
%T Bounds on the Complexity of the Maximal
Common Subsequence Problem
%A A. V. Aho
%A D. S. Hirschberg
%A J. D. Ullman
%J Jour. ACM
%V 23
%N 1
%P 1-12
%M abcd
%D Jan. 1976
w
292
q
$


Finally, execute pub.run, making sure to remember that you have prepared a new file
“new.1”:


$ pub.run new.1


and about fifteen minutes later the new index would be complete and all the changes would be 
included. 


4. Printing a Publication List


There are two commands for printing a publication list, depending on whether you want 
to print one person’s list, or the list of many people. To print a list for one person, use the 
pub.indiv command: 


pub.indiv M Lesk 


This runs off the list for M. Lesk and puts it in file “output”. Note that no ‘.’ is given after 
the initial. In case of ambiguity two initials can be used. Similarly, to get the list for a group of
people, say


pub.org xxx


which prints all the publications of the members of organization xxx, taking the names for the 
list in the file /usr/dict/papers/centlist/xxx. This command should normally be run in the 
background; it takes perhaps 15 minutes. Two options are available with these commands: 


pub.indiv -p M Lesk


prints only the papers, leaving out unpublished notes, patents, etc. Also


pub.indiv -t M Lesk | gcat


prints a typeset copy, instead of a computer printer copy. In this case it has been directed to
an alternate typesetter with the ‘gcat’ command. These options may be used together, and
may be used with the pub.org command as well. For example, to print only the papers for all
of organization zzz and typeset them, you could type


pub.org -t -p zzz | gcat &


These publication lists are printed double column with a citation style taken from a set of 
publication list macros; the macros, of course, can be changed easily to adjust the format of 
the lists. 


Writing Tools - The STYLE and DICTION Programs


L. L. Cherry


Bell Laboratories 
Murray Hill, New Jersey 07974 


W. Vesterman 


Livingston College 
Rutgers University 


1. Introduction 


Computers have become important in the document preparation process, with programs 
to check for spelling errors and to format documents. As the amount of text stored on line 
increases, it becomes feasible and attractive to study writing style and to attempt to help the 
writer in producing readable documents. The system of writing tools described here is a first 
step toward such help. The system includes programs and a data base to analyze writing style 
at the word and sentence level. We use the term “style” in this paper to describe the results 
of a writer’s particular choices among individual words and sentence forms. Although many 
judgements of style are subjective, particularly those of word choice, there are some objective 
measures that experts agree lead to good style. Three programs have been written to measure 
some of the objectively definable characteristics of writing style and to identify some com- 
monly misused or unnecessary phrases. Although a document that conforms to the stylistic 
rules is not guaranteed to be coherent and readable, one that violates all of the rules is likely 
to be difficult or tedious to read. The program STYLE calculates readability, sentence length 
variability, sentence type, word usage and sentence openers at a rate of about 400 words per 
second on a PDP11/70 running the UNIX† Operating System. It assumes that the sentences
are well-formed, i.e., that each sentence has a verb and that the subject and verb agree in
number. DICTION identifies phrases that are either bad usage or unnecessarily wordy. 
EXPLAIN acts as a thesaurus for the phrases found by DICTION. Sections 2, 3, and 4 
describe the programs; Section 5 gives the results on a cross-section of technical documents; 
Section 6 discusses accuracy and problems; Section 7 gives implementation details. 


2. STYLE 


The program STYLE reads a document and prints a summary of readability indices, 
sentence length and type, word usage, and sentence openers. It may also be used to locate all 
sentences in a document longer than a given length, of readability index higher than a given 
number, those containing a passive verb, or those beginning with an expletive. STYLE is 
based on the system for finding English word classes or parts of speech, PARTS [1]. PARTS 
is a set of programs that uses a small dictionary (about 350 words) and suffix rules to partially 
assign word classes to English text. It then uses experimentally derived rules of word order to 
assign word classes to all words in the text with an accuracy of about 95%. Because PARTS 
uses only a small dictionary and general rules, it works on text about any subject, from phy- 
sics to psychology. Style measures have been built into the output phase of the programs that 
make up PARTS. Some of the measures are simple counters of the word classes found by 
PARTS; many are more complicated. For example, the verb count is the total number of verb 


† UNIX is a trademark of Bell Laboratories.


phrases. This includes phrases like:


has been going 
was only going 
to go 


each of which counts as one verb. Figure 1 shows the output of STYLE run on a paper
by Kernighan and Mashey about the UNIX programming environment [2]. 





programming environment
readability grades:
        (Kincaid) 12.3 (auto) 12.8 (Coleman-Liau) 11.8 (Flesch) 13.5 (46.3)
sentence info:
        no. sent 335 no. wds 7419
        av sent leng 22.1 av word leng 4.91
        no. questions 0 no. imperatives 0
        no. nonfunc wds 4362 58.8% av leng 6.38
        short sent (<17) 35% (118) long sent (>32) 16% (55)
        longest sent 82 wds at sent 174; shortest sent 1 wds at sent 117
sentence types:
        simple 34% (114) complex 32% (108)
        compound 12% (41) compound-complex 21% (72)
word usage:
        verb types as % of total verbs
        tobe 45% (373) aux 16% (133) inf 14% (114)
        passives as % of non-inf verbs 20% (144)
        types as % of total
        prep 10.8% (804) conj 3.5% (262) adv 4.8% (354)
        noun 26.7% (1983) adj 18.7% (1388) pron 5.3% (393)
        nominalizations 2 % (155)
sentence beginnings:
        subject opener: noun (63) pron (43) pos (0) adj (58) art (62) tot 67%
        prep 12% (39) adv 9% (31)
        verb 0% (1) subconj 6% (20) conj 1% (5)
        expletives 4% (13)








Figure 1 


As the example shows, STYLE output is in five parts. After a brief discussion of sentences, 
we will describe the parts in order. 


2.1. What is a sentence? 


Readers of documents have little trouble deciding where the sentences end. People don’t 
even have to stop and think about uses of the character “.” in constructions like 1.25, A. J. 
Jones, Ph.D., i. e., or etc... When a computer reads a document, finding the end of sentences 
is not as easy. First we must throw away the printer’s marks and formatting commands that 
litter the text in computer form. Then STYLE defines a sentence as a string of words ending 
in one of: 


. ! ? /.


The end marker “/.” may be used to indicate an imperative sentence. Imperative sentences
that are not so marked are not identified as imperative. STYLE properly handles numbers
with embedded decimal points and commas, strings of letters and numbers with embedded
decimal points used for naming computer file names, and the common abbreviations listed in
Appendix 1. Numbers that end sentences, like the preceding sentence, cause a sentence break 
if the next word begins with a capital letter. Initials only cause a sentence break if the next 
word begins with a capital and is found in the dictionary of function words used by PARTS. 
So the string 


J. D. JONES


does not cause a break, but the string


... system H. The ...


does. With these rules most sentences are broken at the proper place, although occasionally 
either two sentences are called one or a fragment is called a sentence. More on this later. 
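
The two rules for numbers and initials can be sketched as a pair of tests on the word that follows the period. This is an illustration in Python only; FUNCTION_WORDS is a tiny hypothetical stand-in for the PARTS dictionary of function words, and the function names are assumptions.

```python
# Hypothetical stand-in for the PARTS dictionary of function words.
FUNCTION_WORDS = {"the", "a", "an", "this", "it", "in", "of", "and"}


def number_ends_sentence(next_word):
    """A number followed by '.' ends a sentence if the next word is capitalized."""
    return next_word[:1].isupper()


def initial_ends_sentence(next_word, function_words=FUNCTION_WORDS):
    """An initial followed by '.' ends a sentence only if the next word is
    capitalized and is also a known function word."""
    return next_word[:1].isupper() and next_word.lower() in function_words
```

Under these rules the initials in “J. D. JONES” do not end a sentence, because neither “D.” nor “JONES” is a function word, while the “H.” in “system H. The” does, because “The” is capitalized and is a function word.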


2.2. Readability Grades 


The first section of STYLE output consists of four readability indices. As Klare points 
out in [3] readability indices may be used to estimate the reading skills needed by the reader 
to understand a document. The readability indices reported by STYLE are based on meas- 
ures of sentence and word lengths. Although the indices may not measure whether the docu- 
ment is coherent and well organized, experience has shown that high indices seem to be indi- 
cators of stylistic difficulty. Documents with short sentences and short words have low scores; 
those with long sentences and many polysyllabic words have high scores. The 4 formulae 
reported are Kincaid Formula [4], Automated Readability Index [5], Coleman-Liau Formula 
[6] and a normalized version of Flesch Reading Ease Score [7]. The formulae differ because 
they were experimentally derived using different texts and subject groups. We will discuss 
each of the formulae briefly; for a more detailed discussion the reader should see [3].


The Kincaid Formula, given by:

Reading Grade = 11.8 * syl per wd + .39 * wds per sent - 15.59


was based on Navy training manuals that ranged in difficulty from 5.5 to 16.3 in reading grade 
level. The score reported by this formula tends to be in the mid-range of the 4 scores. 
Because it is based on adult training manuals rather than school book text, this formula is 
probably the best one to apply to technical documents. 


The Automated Readability Index (ARI), based on text from grades 0 to 7, was derived
to be easy to automate. The formula is:

Reading Grade = 4.71 * let per wd + .5 * wds per sent - 21.43

ARI tends to produce scores that are higher than Kincaid and Coleman-Liau but are usually
slightly lower than Flesch.


The Coleman-Liau Formula, based on text ranging in difficulty from .4 to 16.3, is:

Reading Grade = 5.89 * let per wd - .3 * sent per 100 wds - 15.8

Of the four formulae this one usually gives the lowest grade when applied to technical
documents.


The last formula, the Flesch Reading Ease Score, is based on grade school text covering
grades 3 to 12. The formula, given by:

Reading Score = 206.835 - 84.6 * syl per wd - 1.015 * wds per sent


is usually reported in the range 0 (very difficult) to 100 (very easy). The score reported by 
STYLE is scaled to be comparable to the other formulas, except that the maximum grade 
level reported is set to 17. The Flesch score is usually the highest of the 4 scores on technical 
documents. 
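

The four formulas can be written out as a short sketch. This is an illustration in Python, not the STYLE source. The coefficients are the commonly published values for each formula (an assumption where the printed text is hard to read), the function names are hypothetical, and the scaling of the Flesch score to a grade level is not attempted because the text does not give it.

```python
def kincaid(syl_per_wd, wds_per_sent):
    """Kincaid reading grade from syllables per word and words per sentence."""
    return 11.8 * syl_per_wd + 0.39 * wds_per_sent - 15.59


def automated_readability_index(let_per_wd, wds_per_sent):
    """ARI reading grade from letters per word and words per sentence."""
    return 4.71 * let_per_wd + 0.5 * wds_per_sent - 21.43


def coleman_liau(let_per_wd, sent_per_100_wds):
    """Coleman-Liau reading grade from letters per word and sentences per 100 words."""
    return 5.89 * let_per_wd - 0.3 * sent_per_100_wds - 15.8


def flesch_reading_ease(syl_per_wd, wds_per_sent):
    """Flesch Reading Ease Score (0 = very difficult, 100 = very easy)."""
    return 206.835 - 84.6 * syl_per_wd - 1.015 * wds_per_sent
```

As a rough check, the counts in Figure 1 (335 sentences, 7419 words, average word length 4.91) give an ARI of about 12.8 and a Coleman-Liau grade of about 11.8, in agreement with the grades printed there.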


Coke [8] found that the Kincaid Formula is probably the best predictor for technical
documents; both ARI and Flesch tend to overestimate the difficulty; Coleman-Liau tends to
underestimate. On text in the range of grades 7 to 9 the four formulas tend to be about the
same. On easy text the Coleman-Liau formula is probably preferred since it is reasonably 
accurate at the lower grades and it is safer to present text that is a little too easy than a little 
too hard. 


If a document has particularly difficult technical content, especially if it includes a lot of
mathematics, it is probably best to make the text very easy to read, i.e., to lower the readability
index by shortening the sentences and words. This will allow the reader to concentrate on the
technical content and not the long sentences. The user should remember that these indices
are estimators; they should not be taken as absolute numbers. STYLE called with “-r
number” will print all sentences with an Automated Readability Index equal to or greater
than “number”.


2.3. Sentence length and structure 


The next two sections of STYLE output deal with sentence length and structure. 
Almost all books on writing style or effective writing emphasize the importance of variety in 
sentence length and structure for good writing. Ewing’s first rule in discussing style in the 
book Writing for Results [9] is: 


“Vary the sentence structure and length of your sentences.” 


Leggett, Mead and Charvat break this rule into 3 in Prentice-Hall Handbook for Writers [10] 
as follows: 


“34a. Avoid the overuse of short simple sentences.” 
“34b. Avoid the overuse of long compound sentences.”
“34c. Use various sentence structures to avoid monotony and increase effectiveness.” 


Although experts agree that these rules are important, not all writers follow them. Sample 
technical documents have been found with almost no sentence length or type variability. One 
document had 90% of its sentences about the same length as the average; another was made 
up almost entirely of simple sentences (80%).


The output sections labeled “sentence info” and “sentence types” give both length and
structure measures. STYLE reports on the number and average length of both sentences and 
words, and number of questions and imperative sentences (those ending in “/.”). The meas- 
ures of non-function words are an attempt to look at the content words in the document. In 
English non-function words are nouns, adjectives, adverbs, and non-auxiliary verbs; function 
words are prepositions, conjunctions, articles, and auxiliary verbs. Since most function words 
are short, they tend to lower the average word length. The average length of non-function 
words may be a more useful measure for comparing word choice of different writers than the 
total average word length. The percentages of short and long sentences measure sentence 
length variability. Short sentences are those at least 5 words less than the average; long sen- 
tences are those at least 10 words longer than the average. Last in the sentence information 
section is the length and location of the longest and shortest sentences. If the flag “-l
number” is used, STYLE will print all sentences longer than “number”.
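

The short and long sentence measures can be sketched directly from the definitions above. This is an illustration in Python following the prose (at least 5 words below the average, at least 10 words above it); the function name length_variability and the returned dictionary are assumptions, and STYLE itself may round the thresholds differently.

```python
def length_variability(sentence_lengths):
    """Count short and long sentences relative to the average sentence length."""
    n = len(sentence_lengths)
    average = sum(sentence_lengths) / n
    # Short: at least 5 words less than the average.
    short = sum(1 for length in sentence_lengths if length <= average - 5)
    # Long: at least 10 words more than the average.
    long_count = sum(1 for length in sentence_lengths if length >= average + 10)
    return {
        "average": average,
        "short": short,
        "short_pct": 100 * short / n,
        "long": long_count,
        "long_pct": 100 * long_count / n,
    }
```

A document in which both percentages are near zero has almost no variation in sentence length, which is the condition the measures are meant to expose.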


Because of the difficulties in dealing with the many uses of commas and conjunctions in 
English, sentence type definitions vary slightly from those of standard textbooks, but still 
measure the same constructional activity. 


1. A simple sentence has one verb and no dependent clause. 


2. A complex sentence has one independent clause and one dependent clause, each with 
one verb. Complex sentences are found by identifying sentences that contain either a 
subordinate conjunction or a clause beginning with words like “that” or “who”. The 
preceding sentence has such a clause. 

3. A compound sentence has more than one verb and no dependent clause. Sentences
joined by “;” are also counted as compound.


4. A compound-complex sentence has either several dependent clauses or one dependent
clause and a compound verb in either the dependent or independent clause.
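

The four definitions can be summarized as a small decision procedure on two counts: the number of verbs and the number of dependent clauses in a sentence. This is a hedged sketch in Python, not the logic of STYLE itself; the function name sentence_type is hypothetical, and the rule for sentences joined by “;” is not modeled.

```python
def sentence_type(verbs, dependent_clauses):
    """Classify a sentence from its verb count and dependent-clause count."""
    if dependent_clauses == 0:
        # No dependent clause: one verb is simple, more than one is compound.
        return "simple" if verbs <= 1 else "compound"
    if dependent_clauses == 1 and verbs <= 2:
        # One independent and one dependent clause, each with one verb.
        return "complex"
    # Several dependent clauses, or one dependent clause plus a compound verb.
    return "compound-complex"
```

For instance, a sentence with three verbs and one dependent clause falls through to the last case, since one of its clauses must contain a compound verb.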


Even using these broader definitions, simple sentences dominate many of the technical 
documents that have been tested, but the example in Figure 1 shows variety in both sentence 
structure and sentence length. 


2.4. Word Usage 


The word usage measures are an attempt to identify some other constructional features 
of writing style. There are many different ways in English to say the same thing. The con- 
structions differ from one another in the form of the words used. The following sentences all 
convey approximately the same meaning but differ in word usage:


The cxio program is used to perform all communication between the systems.
The cxio program performs all communications between the systems.
The cxio program is used to communicate between the systems.
The cxio program communicates between the systems.
All communication between the systems is performed by the cxio program.


The distribution of the parts of speech and verb constructions helps identify overuse of par- 
ticular constructions. Although the measures used by STYLE are crude, they do point out 
problem areas. For each category, STYLE reports a percentage and a raw count. In addition 
to looking at the percentage, the user may find it useful to compare the raw count with the 
number of sentences. If, for example, the number of infinitives is almost equal to the number 
of sentences, then many of the sentences in the document are constructed like the first and 
third in the preceding example. The user may want to transform some of these sentences into 
another form. Some of the implications of the word usage measures are discussed below. 


Verbs are measured in several different ways to try to determine what types of verb construc- 
tions are most frequent in the document. Technical writing tends to contain many pas- 
sive verb constructions and other usage of the verb “to be”. The category of verbs 
labeled “tobe” measures both passives and sentences of the form: 


subject tobe predicate 


In counting verbs, whole verb phrases are counted as one verb. Verb phrases containing 
auxiliary verbs are counted in the category “aux”. The verb phrases counted here are 
those whose tense is not simple present or simple past. It might eventually be useful to 
do more detailed measures of verb tense or mood. Infinitives are listed as “inf”. The 
percentages reported for these three categories are based on the total number of verb 
phrases found. These categories are not mutually exclusive; they cannot be added, since, 
for example, “to be going” counts as both “tobe” and “inf”. Use of these three types of
verb constructions varies significantly among authors. 


STYLE reports passive verbs as a percentage of the finite verbs in the document. Most 
style books warn against the overuse of passive verbs. Coleman [11] has shown that sen- 
tences with active verbs are easier to learn than those with passive verbs. Although the 
inverted object-subject order of the passive voice seems to emphasize the object, 
Coleman’s experiments showed that there is little difference in retention by word posi- 
tion. He also showed that the direct object of an active verb is retained better than the 
subject of a passive verb. These experiments support the advice of the style books
suggesting that writers should try to use active verbs wherever possible. The flag “-p”
causes STYLE to print all sentences containing passive verbs. 


Pronouns 
add cohesiveness and connectivity to a document by providing back-reference. They are 
often a short-hand notation for something previously mentioned, and therefore connect 
the sentence containing the pronoun with the word to which the pronoun refers. 
Although there are other mechanisms for such connections, documents with no pronouns 
tend to be wordy and to have little connectivity. 


Adverbs 
can provide transition between sentences and order in time and space. In performing 
these functions, adverbs, like pronouns, provide connectivity and cohesiveness. 


Conjunctions 
provide parallelism in a document by connecting two or more equal units. These units 
may be whole sentences, verb phrases, nouns, adjectives, or prepositional phrases. The 
compound and compound-complex sentences reported under sentence type are parallel 
structures. Other uses of parallel structures are indicated by the degree that the number 
of conjunctions reported under word usage exceeds the compound sentence measures. 


Nouns and Adjectives. 
A ratio of nouns to adjectives near unity may indicate the over-use of modifiers. Some 
technical writers qualify every noun with one or more adjectives. Qualifiers in phrases 
like “simple linear single-link network model” often lend more obscurity than precision 
to a text. 


Nominalizations 
are verbs that are changed to nouns by adding one of the suffixes “ment”, “ance”, 
“ence”, or “ion”. Examples are accomplishment, admittance, adherence, and abbreviation. 
When a writer transforms a nominalized sentence to a non-nominalized sentence, 
she/he increases the effectiveness of the sentence in several ways. The noun becomes an 
active verb and frequently one complicated clause becomes two shorter clauses. For 
example, 


Their inclusion of this provision is admission of the importance of the system. 
When they included this provision, they admitted the importance of the system. 


Coleman found that the transformed sentences were easier to learn, even when the 
transformation produced sentences that were slightly longer, provided the transforma- 
tion broke one clause into two. Writers who find their document contains many nomi- 
nalizations may want to transform some of the sentences to use active verbs. 
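
As a rough illustration of the suffix rule above, the following Python sketch counts words ending in one of the four suffixes. It is an assumption made for illustration, not the method STYLE itself uses: the function name, the minimum stem length, and the optional plural “s” are all hypothetical, and a plain suffix test also flags words such as “importance” that are not nominalized verbs.

```python
import re

# Suffixes named in the text; everything else in this sketch is an assumption.
NOMINAL_SUFFIXES = ("ment", "ance", "ence", "ion")

# A stem of at least three letters, one of the suffixes, and an optional plural "s".
NOMINAL_RE = re.compile(r"[a-z]{3,}(?:%s)s?" % "|".join(NOMINAL_SUFFIXES))

def count_nominalizations(text):
    """Return (suffix_hits, total_words) for the given text."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = [w for w in words if NOMINAL_RE.fullmatch(w)]
    return len(hits), len(words)

before = "Their inclusion of this provision is admission of the importance of the system."
after = "When they included this provision, they admitted the importance of the system."

print(count_nominalizations(before))
print(count_nominalizations(after))
```

On the two example sentences the count of flagged words drops after the transformation, which is the direction the text describes.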


2.5. Sentence openers 


Another agreed upon principle of style is variety in sentence openers. Because STYLE 
determines the type of sentence opener by looking at the part of speech of the first word in 
the sentence, the sentences counted under the heading “subject opener” may not all really 
begin with the subject. However, a large percentage of sentences in this category still indi- 
cates lack of variety in sentence openers. Other sentence opener measures help the user 
determine if there are transitions between sentences and where the subordination occurs. 
Adverbs and conjunctions at the beginning of sentences are mechanisms for transition 
between sentences. A pronoun at the beginning shows a link to something previously men- 
tioned and indicates connectivity. 


The location of subordination can be determined by comparing the number of sentences 
that begin with a subordinator with the number of sentences with complex clauses. If few 
sentences start with subordinate conjunctions then the subordination is embedded or at the 
end of the complex sentences. For variety the writer may want to transform some sentences 
to have leading subordination. 


The last category of openers, expletives, is commonly overworked in technical writing. 
Expletives are the words “it” and “there”, usually with the verb “to be”, in constructions 
where the subject follows the verb. For example, 
There are three streets used by the traffic. 
There are too many users on this system. 


This construction tends to emphasize the object rather than the subject of the sentence. The 
flag “-e” will cause STYLE to print all sentences that begin with an expletive. 
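
The expletive test can be approximated in a few lines of Python. This is only a sketch under stated assumptions: STYLE decides on the basis of part-of-speech information, whereas this hypothetical function merely checks that the first word is “it” or “there” and the second is a form of “to be”, so it will also flag sentences in which “it” is an ordinary pronoun.

```python
import re

# Forms of "to be" accepted after the opener (an illustrative, incomplete list).
TO_BE = {"is", "are", "was", "were", "be", "been", "being"}

def starts_with_expletive(sentence):
    """True if the sentence opens with "it" or "there" followed by a form of "to be"."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return len(words) >= 2 and words[0] in ("it", "there") and words[1] in TO_BE

for s in ["There are three streets used by the traffic.",
          "Three streets are used by the traffic."]:
    print(starts_with_expletive(s), s)
```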


3. DICTION 


The program DICTION prints all sentences in a document containing phrases that are 
either frequently misused or indicate wordiness. The program, an extension of Aho’s FGREP 
[12] string matching program, takes as input a file of phrases or patterns to be matched and a 
file of text to be searched. A data base of about 450 phrases has been compiled as a default 
pattern file for DICTION. Before attempting to locate phrases, the program maps upper case 
letters to lower case and substitutes blanks for punctuation. Sentence boundaries were 
deemed less critical in DICTION than in STYLE, so abbreviations and other uses of the char- 
acter “.” are not treated specially. DICTION brackets all pattern matches in a sentence with 
the characters “[” and “]”. Although many of the phrases in the default data base are correct in 
some contexts, in others they indicate wordiness. Some examples of the phrases and sug- 
gested alternatives are: 


Phrase                      Alternative 
a large number of           many 
arrive at a decision        decide 
collect together            collect 
for this reason             so 
pertaining to               about 
through the use of          by or with 
utilize                     use 
with the exception of       except 


Appendix 2 contains a complete list of the default file. Some of the entries are short forms of 
problem phrases. For example, the phrase “the fact” is found in all of the following and is 
sufficient to point out the wordiness to the user: 


Phrase                                Alternative 
accounted for by the fact that        caused by 
an example of this is the fact that   thus 
based on the fact that                because 
despite the fact that                 although 
due to the fact that                  because 
in light of the fact that             because 
in view of the fact that              since 
notwithstanding the fact that         although 


Entries in Appendix 2 preceded by “~” are not matched. See Section 7 for details on the use 
of “~”. 


The user may supply her/his own pattern file with the flag “-f patfile”. In this case the 
default file will be loaded first, followed by the user file. This mechanism allows users to 
suppress patterns contained in the default file or to include their own pet peeves that are not 
in the default file. The flag “-n” will exclude the default file altogether. In constructing a 
pattern file, blanks should be used before and after each phrase to avoid matching substrings 
in words. For example, to find all occurrences of the word “the”, the pattern “ the ” should 
be used. The blanks cause only the word “the” to be matched and not the string “the” in 
words like there, other, and therefore. One side effect of surrounding the words with blanks is 
that when two phrases occur without intervening words, only the first will be matched. 
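
The normalization and blank-padding rules described in this section can be sketched in Python as follows. This is a minimal illustration, not DICTION's implementation (which extends FGREP); the function names are hypothetical, and a single left-to-right regular-expression scan is assumed as a stand-in for the real matcher.

```python
import re

def normalize(sentence):
    """Map upper case to lower case and substitute a blank for each punctuation character."""
    blanked = re.sub(r"[^a-z0-9\s]", " ", sentence.lower())
    # Pad with blanks so a phrase at either end of the sentence can still match.
    return " " + blanked + " "

def find_phrases(sentence, patterns):
    """Return the blank-padded patterns found in one non-overlapping left-to-right scan."""
    if not patterns:
        return []
    # Longer patterns are tried first at each position.
    ordered = sorted(patterns, key=len, reverse=True)
    scanner = re.compile("|".join(re.escape(p) for p in ordered))
    return [m.group(0) for m in scanner.finditer(normalize(sentence))]

patterns = [" the ", " utilize "]
print(find_phrases("Therefore, other theories exist.", patterns))
print(find_phrases("Utilize all the tools.", patterns))
print(find_phrases("Please utilize the tool.", patterns))
```

The last call shows the side effect mentioned above: “utilize” and “the” are adjacent, the blank between them is consumed by the first match, and so only the first phrase is reported.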
4. EXPLAIN 


The last program, EXPLAIN, is an interactive thesaurus for phrases found by DIC- 
TION. The user types one of the phrases bracketed by DICTION and EXPLAIN responds 
with suggested substitutions for the phrase that will improve the diction of the document. 


Table 1 
Text Statistics on 20 Technical Documents 


category          variable               minimum  maximum   median  std. dev. 

readability       Kincaid                    9.5     16.9      13.       2.2 
                  automated                  9.0     17.4     13.3       2.5 
                  Coleman-Liau              10.0     16.0     12.7       1.8 
                  Flesch                 [illegible] 
sentence info.    av sent length            15.5     30.3     21.6       4.0 
                  av word length            4.61     5.63     5.08       .29 
                  av nonfunction length     5.72     7.30     6.52       .45 
                  short sent                 23%      46%      33%       5.9 
                  long sent              [illegible] 
sentence types    simple                     31%      71%      49%      11.4 
                  complex                    19%      50%      33%       8.3 
                  compound                    2%      14%       7%       3.3 
                  compound-complex            2%  [remainder illegible] 
verb types        tobe                       26%      64%    44.7%      10.3 
                  auxiliary                  10%      40%      21%       8.7 
                  infinitives                 8%      24%    15.1%       4.8 
                  passives               [illegible] 
word usage        prepositions             10.1%    15.0%    12.3%       1.6 
                  conjunction               1.8%     4.8%     3.4%        .9 
                  adverbs                   1.2%     5.0%     3.4%       1.0 
                  nouns                    23.6%    31.6%    27.8%       1.7 
                  adjectives               15.4%    27.1%    21.1%       3.4 
                  pronouns                  1.2%     8.4%     2.5%       1.1 
sentence openers  prepositions                6%      19%      12%       3.4 
                  adverbs                     0%      20%       9%       4.6 
                  subject                    56%      85%      70%       8.0 
                  verbs                       0%       4%       1%       1.0 
                  subordinating conj          1%      12%       5%       2.7 
                  conjunctions                0%       4%       0%       1.5 
                  expletives                  0%       6%       2%       1.7 

5. Results 
5.1. STYLE 


To get baseline statistics and check the program’s accuracy, we ran STYLE on 20 techni- 
cal documents. There were a total of 3287 sentences in the sample. The shortest document 
was 67 sentences long; the longest 339 sentences. The documents covered a wide range of sub- 
ject matter, including theoretical computing, physics, psychology, engineering, and affirmative 
action. Table 1 gives the range, median, and standard deviation of the various style measures. 
As you will note, most of the measurements have a fairly wide range of values across the 
sample documents. 


As a comparison, Table 2 gives the median results for two different technical authors, a 
sample of instructional material, and a sample of the Federalist Papers. The two authors 
show similar styles, although author 2 uses somewhat shorter sentences and longer words than 
author 1. Author 1 uses all types of sentences, while author 2 prefers simple and complex sen- 
tences, using few compound or compound-complex sentences. The other major difference in 
the styles of these authors is the location of subordination. Author 1 seems to prefer embed- 
ded or trailing subordination, while author 2 begins many sentences with the subordinate 
clause. The documents tested for both authors 1 and 2 were technical documents, written for 
a technical audience. The instructional documents, which are written for craftspeople, vary 
surprisingly little from the two technical samples. The sentences and words are a little longer, 
and they contain many passive and auxiliary verbs, few adverbs, and almost no pronouns. 
The instructional documents contain many imperative sentences, so there are many sentences 
with verb openers. The sample of Federalist Papers contrasts with the other samples in 
almost every way. 


Table 2 
Text Statistics on Single Authors 


category          variable               author 1  author 2     inst.       FED 

readability       Kincaid                    11.0      10.3      10.8      16.3 
                  automated                  11.0      10.3      11.9      17.8 
                  Coleman-Liau                9.3      10.1      10.2      12.3 
                  Flesch                     10.3      10.7      10.1      15.0 
sentence info     av sent length            22.64     19.61     22.78     31.85 
                  av word length             4.47      4.66      4.65      4.95 
                  av nonfunction length      5.64      5.92      6.04      6.87 
                  short sent                  35%       43%       35%       40% 
                  long sent                   18% [illegible]     16%       21% 
sentence types    simple                      36%       43%       40%       31% 
                  complex                     34%       41%       37%       34% 
                  compound                    13%        7%        4%       10% 
                  compound-complex            16%        8%       14%       25% 
verb type         tobe                        42%       43%       45%       37% 
                  auxiliary                   17%       19%       32%       32% 
                  infinitives                 17%       15%       12%       21% 
                  passives                    20%       19%       86%       20% 
word usage        prepositions              10.0%     10.8%     12.3%     15.9% 
                  conjunctions               3.2%      2.4%      3.9%      3.4% 
                  adverbs                   5.05%      4.6%      3.5%      3.7% 
                  nouns                     27.7%     26.5%     29.1%     24.9% 
                  adjectives                17.0%     19.0%     15.4%     12.4% 
                  pronouns                   5.3%      4.3%      2.1%      6.5% 
sentence openers  prepositions                11%       14%        6%        5% 
                  adverbs                      9%        9%        6%        4% 
                  subject                     65%       59%       54%       66% 
                  verb                         3%        2%       14%        2% 
                  subordinating conj           8%       14%       11%        3% 
                  conjunction                  1%        0%        0%        3% 
                  expletives                   3%        3%        0%        3% 


5.2. DICTION 

In the few weeks that DICTION has been available to users about 35,000 sentences have 
been run with about 5,000 string matches. The authors using the program seem to make the 
suggested changes about 50-75% of the time. To date, almost 200 of the 450 strings in the 
default file have been matched. Although most of these phrases are valid and correct in some 
contexts, the 50-75% change rate seems to show that the phrases are used much more often 
than concise diction warrants. 


6. Accuracy 


6.1. Sentence Identification 


The correctness of the STYLE output on the 20 document sample was checked in detail. 
STYLE misidentified 129 sentence fragments as sentences and incorrectly joined two or more 
sentences 75 times in the 3287 sentence sample. The problems were usually because of non- 
standard formatting commands, unknown abbreviations, or lists of non-sentences. An impos- 
sibly long sentence found as the longest sentence in the document usually is the result of a 
long list of non-sentences. 


6.2. Sentence Types 


STYLE correctly identified sentence type on 86.5% of the sentences in the sample. The 
type distribution of the sentences was 52.5% simple, 29.9% complex, 8.5% compound and 9% 
compound-complex. The program reported 49.5% simple, 31.9% complex, 8% compound and 
10.4% compound-complex. Looking at the errors on the individual documents, the number of 
simple sentences was under-reported by about 4% and the complex and compound-complex 
were over-reported by 3% and 2%, respectively. The following matrix shows the program’s 
output vs. the actual sentence type. 


                                  Program Results 
                        simple  complex  compound  comp-complex 

Actual    simple          1566      132        49            17 
Sentence  complex           47      892         6            65 
Type      compound          40        6       207            23 
          comp-complex       0       52         5           249 
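
The summary percentages can be recomputed from the matrix with a short Python sketch. The variable names are illustrative assumptions. The counts as printed here sum to 3356 rather than the 3287 sentences quoted for the sample, so the figures come out close to, but not identical with, those given in the text.

```python
LABELS = ["simple", "complex", "compound", "comp-complex"]

# Rows are the actual sentence type, columns are the type the program reported.
MATRIX = [
    [1566, 132, 49, 17],
    [47, 892, 6, 65],
    [40, 6, 207, 23],
    [0, 52, 5, 249],
]

total = sum(sum(row) for row in MATRIX)
correct = sum(MATRIX[i][i] for i in range(len(LABELS)))
accuracy = correct / total

# Share of each type according to the hand check (row sums) and the program (column sums).
actual_share = {LABELS[i]: sum(MATRIX[i]) / total for i in range(len(LABELS))}
reported_share = {LABELS[j]: sum(row[j] for row in MATRIX) / total
                  for j in range(len(LABELS))}

print("accuracy: %.1f%%" % (100 * accuracy))
for label in LABELS:
    print("%-13s actual %5.1f%%  reported %5.1f%%"
          % (label, 100 * actual_share[label], 100 * reported_share[label]))
```

The diagonal holds the correctly classified sentences; comparing row sums with column sums gives the under- and over-reporting discussed above.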


The system’s inability to find imperative sentences seems to have little effect on most of 
the style statistics. A document with half of its sentences imperative was run, with and 
without the imperative end marker. The results were identical except for the expected errors 
of not finding verbs as sentence openers, not counting the imperative sentences, and a slight 
difference (1%) in the number of nouns and adjectives reported. 


6.3. Word Usage 


The accuracy of identifying word types reflects that of PARTS, which is about 95% 
correct. The largest source of confusion is between nouns and adjectives. The verb counts 
were checked on about 20 sentences from each document and found to be about 98% correct. 


7. Technical Details 


7.1. Finding Sentences 


The formatting commands embedded in the text increase the difficulty of finding sen- 
tences. Not all text in a document is in sentence form; there are headings, tables, equations 
and lists, for example. Headings like “Finding Sentences” above should be discarded, not 
attached to the next sentence. However, since many of the documents are formatted to be 
phototypeset, and contain font changes, which usually operate on the most important words in 
the document, discarding all formatting commands is not correct. To improve the programs’ 
ability to find sentence boundaries, the deformatting program, DEROFF [13], has been given 
some knowledge of the formatting packages used on the UNIX operating system. DEROFF 
will now do the following: 
1. Suppress all formatting macros that are used for titles, headings, author’s name, etc. 

2. Suppress the arguments to the macros for titles, headings, author’s name, etc. 

3. Suppress displays, tables, footnotes and text that is centered or in no-fill mode. 

4. Substitute a place holder for equations and check for hidden end markers. The place 
holder is necessary because many typists and authors use the equation setter to change 
fonts on important words. For this reason, header files containing the definition of the 
EQN delimiters must also be included as input to STYLE. End markers are often hidden 
when an equation ends a sentence and the period is typed inside the EQN delimiters. 

5. Add a “.” after lists. If the flag -ml is also used, all lists are suppressed. This is a 
separate flag because of the variety of ways the list macros are used. Often, lists are 
sentences that should be included in the analysis. The user must determine how lists are 
used in the document to be analyzed. 


Both STYLE and DICTION call DEROFF before they look at the text. The user should 
supply the -ml flag if the document contains many lists of non-sentences that should be 
skipped. 


7.2. Details of DICTION 


The program DICTION is based on the string matching program FGREP. FGREP 
takes as input a file of patterns to be matched and a file to be searched and outputs each line 
that contains any of the patterns with no indication of which pattern was matched. The fol- 
lowing changes have been added to FGREP: 


1. The basic unit that DICTION operates on is a sentence rather than a line. Each 
sentence that contains one of the patterns is output. 

2. Upper case letters are mapped to lower case. 

3. Punctuation is replaced by blanks. 

4. All pattern matches in the sentence are found and surrounded with “[” and “]”. 

5. A method for suppressing a string match has been added. Any pattern that begins with 
“~” will not be matched. Because the matching algorithm finds the longest substring, the 
suppression of a match allows words in some correct contexts not to be matched while 
allowing the word in another context to be found. For example, the word “which” is 
often incorrectly used instead of “that” in restrictive clauses. However, “which” is 
usually correct when preceded by a preposition or “,”. The default pattern file suppresses 
the match of the common prepositions or a double blank followed by “which” and 
therefore matches only the suspect uses. The double blank accounts for the replaced comma. 
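
The interaction between longest-match scanning and “~” suppression can be sketched in Python. This is an illustration under stated assumptions, not the FGREP-based code: the function name is hypothetical, the input is assumed to be already lower-cased with punctuation replaced by blanks, and a simple position-by-position scan stands in for the real matching algorithm.

```python
def match_with_suppression(text, patterns):
    """Scan left to right, taking the longest pattern that matches at each position.

    Patterns that begin with "~" consume their text but are not reported.
    """
    candidates = []
    for p in patterns:
        suppressed = p.startswith("~")
        phrase = p[1:] if suppressed else p
        if phrase:  # ignore empty patterns so the scan always advances
            candidates.append((phrase, suppressed))
    candidates.sort(key=lambda c: len(c[0]), reverse=True)

    found, i = [], 0
    while i < len(text):
        for phrase, suppressed in candidates:
            if text.startswith(phrase, i):
                if not suppressed:
                    found.append(phrase)
                i += len(phrase)
                break
        else:
            i += 1
    return found

# Hypothetical pattern file: flag "which" unless a preposition or a comma
# (now a second blank) comes before it.
patterns = [" which ", "~ of which ", "~  which "]

print(match_with_suppression(" the file which we read ", patterns))
print(match_with_suppression(" the file of which we spoke ", patterns))
print(match_with_suppression(" the file  which we read ", patterns))
```

Because the suppressed patterns are longer than “ which ”, they win at the positions where they apply and the shorter pattern never gets a chance to match there.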


8. Conclusions 


A system of writing tools that measure some of the objective characteristics of writing 
style has been developed. The tools are sufficiently general that they may be applied to 
documents on any subject with equal accuracy. Although the measurements are only of the surface 
structure of the text, they do point out problem areas. In addition to helping writers produce 
better documents, these programs may be useful for studying the writing process and finding 
other formulae for measuring readability. 
Appendix 1 


STYLE Abbreviations 


Ch. 
Dr. 
Drs. 
e. g. 
eq. 
et al. 
etc. 
Fig. 
Figs. 
figs. 
ft. 
i. e. 
in. 
Inc. 
Jr. 
jr. 
mi. 
Mr. 
Mrs. 
Ms. 
No. 
no. 
Nos. 
nos. 
P. M. 
p. m. 
Ph. D. 
Ph. d. 
Ref. 
ref. 
Refs. 
refs. 
St. 
vs. 
yr. 
Appendix 2 


Default DICTION Patterns 


a great deal of 
a large number of 
a lot of 
a majority of 
a need for 
a number of 
a particular preference for 
a preference for 
a small number of 
a tendency to 
abovementioned 
absolutely complete 
absolutely essential 
accomplished 
accordingly 
activate 
actual 
added increments 
adequate enough 
advent 
afford an opportunity 
aggregate 
all of 
all throughout 
along the line 
an indication of 
analyzation 
and etc 
and or 
another additional 
any and all 
arrive at a 
as a matter of fact 
as a method of 
as good or better than 
as of now 
as per 
as regards 
as related to 
as to 
assistance 
assistance to 
assuming that 
at a later date 
at about 
at above 
at all times 
at an early date 
at below 
at the present 
at the time when 
at this point in time 
at this time 
at which time 
at your earliest convenience 
authorization 
awful 
basic fundamentals 
basically 
be cognizant of 
being as 
being that 
brief in duration 
bring to a conclusion 
but that 
but what 
by means of 
by the use of 
carry out experiments 
center about 
center around 


center portion 
check into 

check on 

check up on 

circle around 

close proximity 
collaborate together 
collect together 
combine together 
come to an end 
commence 

common accord 
compensation 
completely eliminated 
comprise 
concerning 


conduct an investigation of 


conjecture 

connect up 
consensus of opinion 
consequent result 
consolidate together 
construct 
contemplate 
continue on 
continue to remain 
could of 

count up 

couple together 
debate about 
decide on 
deleterious effect 
demean 
demonstrate 
depreciate in value 
deserving of 
desirable benefits 
desirous of 
different than 
discontinue 
disutility 

divide up 

doubt but 

due to 

duly noted 

during the time that 
each and every 
early beginnings 
effectuate 
emotional feelings 
empty out 
enclosed herein 
enclosed herewith 
end result 

end up 

endeavor 

enter in 

enter into 
enthused 

entirely complete 
equally good as 
essentially 
eventuate 

every now and then 
exactly identical 
experiencing difficulty 
fabricate 

face up to 
facilitate 

facts and figures 
fast in action 
fearful of 


fearful that 

few in number 

file away 

final completion 
final ending 

final outcome 

final result 

finalize 

find it interesting to know 
first and foremost 
first beginnings 
first initiated 
firstly 

follow after 
following after 

for the purpose of 
for the reason that 
for the simple reason that 
for this reason 

for your information 
from the point of view of 
full and complete 
generally agreed 
good and 

got to 

gratuitous 

greatly minimize 
head up 

help but 

helps in the production of 
hopeful 

if and when 

if at all possible 
impact 

implement 
important essentials 
importantly 

in a large measure 
in a position to 

in accordance 

in advance of 

in agreement with 
in all cases 

in back of 

in behalf of 

in behind 

in between 

in case 

in close proximity 
in conflict with 

in conjunction with 
in connection with 
in fact 

in large measure 

in many cases 

in most cases 

in my opinion I think 
in order to 

in rare cases 

in reference to 

in regard to 

in regards to 

in relation with 

in short supply 

in size 

in terms of 

in the amount of 
in the case of 

in the course of 

in the event 

in the field of 


in the form of 

in the instance of 

in the interim 

in the last analysis 
in the matter of 

in the near future 

in the neighborhood of 
in the not too distant future 
in the proximity of 
in the range of 

in the same way as described 
in the shape of 

in the vicinity of 

in this case 

in view of the 

in violation of 
inasmuch as 

indicate 

indicative of 
initialize 

initiate 

injurious to 

inquire 

inside of 

institute a 

intents and purposes 
intermingle 
irregardless 

is defined as 

is used to control 

is when 

is where 

it is incumbent 

it stands to reason 

it was noted that if 
joint cooperation 
joint partnership 
just exactly 

kind of 

know about 

last but not least 
later on 

leaving out of consideration 
liable 

link up 

literally 

little doubt that 

lose out on 

lots of 

main essentials 
make a 

make adjustments to 
make an 

make application to 
make contact with 
make mention of 
make out a list of 
make the acquaintance of 
make the adjustment 
manner 

maximum possible 
meaningful 

meet up with 

melt down 

melt up 
methodology 

might of 

minimize as far as possible 
minor importance 
miss out on 
modification 


more preferable 
most unique 

must of 

mutual cooperation 
necessary requisite 
necessitate 

need for 

nice 

not be un 

not in a position to 
not of a high order of accuracy 
not un 
notwithstanding 

of considerable magnitude 
of that 

of the opinion that 
off of 

on a few occasions 
on account of 

on behalf of 

on the grounds that 
on the occasion 

on the part of 

one of the 

open up 

operates to correct 
outside of 

over with 

overall 

past history 
perceptive of 
perform a measurement 
perform the measurement 
permits the reduction of 
personalize 
pertaining to 
physical size 

plan ahead 

plan for the future 
plan in advance 
plan on 

present a conclusion 
present a report 
presently 

prior to 

prioritize 

proceed to 

procure 

productive of 
prolong the duration 
protrude out from 
provided that 
pursuant to 

put to use in 

range all the way from 
reason is because 
reason why 

recur again 

reduce down 

refer back 

reference to this 
reflective of 
regarding 

regretful 

reinitiate 

relative to 

repeat again 
representative of 
resultant effect 
resume again 
retreat back 

return again 

return back 

revert back 

seal off 


seems apparent 

send a communication 
short space of time 
should of 

single unit 

situation 

so as to 

sort of 

spell out 

still continue 

still remain 

subsequent 
substantially in agreement 
succeed in 

suggestive of 

superior than 
surrounding circumstances 
take appropriate 

take cognizance of 
take into consideration 
termed as 

terminate 

termination 

the author 

the authors 

the case that 

the fact 

the foregoing 

the foreseeable future 
the fullest possible extent 
the majority of 

the nature 

the necessity of 

the only difference being that 
the order of 

the point that 

the truth is 

there are not many 
through the medium of 
through the use of 
throughout the entire 
time interval 

to summarize the above 
total effect of all this 
totality 

transpire 

true facts 

try and 

ultimate end 

under a separate cover 
under date of 

under separate cover 
under the necessity to 
underlying purpose 
undertake a study 
uniformly consistent 
unique 

until such time as 

up to this time 

upshot 

utilize 

very 

very complete 

very unique 

vital 

which 

with a view to 

with reference to 

with regard to 

with the exception of 
with the object of 

with the result that 
with this in mind, it is clear that 
within the realm of possibility 
without further delay 
worth while 
would of 

ing behavior 
wise 

~ which 

~ about which 

~ after which 

~ at which 

~ between which 
~ by which 

~ for which 

~ from which 

~ in which 

~ into which 

~ of which 

~ on which 

~ on which 

~ over which 

~ through which 
~ to which 

~ under which 

~ upon which 

~ with which 

~ without which 
~clockwise 
~likewise 
~otherwise 
PART 6: MISCELLANEOUS 


This part contains articles you may find helpful on unsupported software. 


Learn 


The article on Learn, by Kernighan and Lesk, tells how you can create and use computer- 
aided-instruction (CAI) courses. Read “LEARN - Computer-Aided Instruction on UNIX” if 
you plan to develop CAI courses. This article is not for people new to ULTRIX-32 or those 
who want help in using a CAI course that has already been developed. The Learn utility is 
available on ULTRIX-32, but it is not supported. 


Rogue 


When you feel comfortable with the ULTRIX-32 system, you may want to play Rogue. “A 
Guide to the Dungeons of Doom” is the first step on an adventure that will test your courage 
and intuition. With the help of the guide, you may be able to return from the dungeons of 
doom. Rogue and a variety of other games are available on the ULTRIX-32 system, but they 
are not supported. 


Berkeley Fonts 


The “Berkeley Font Catalogue” shows sample raster fonts developed at Berkeley. These fonts 
are available on the ULTRIX-32 system, but are not supported. 


PDP-11 Assembler 


The “UNIX Assembler Reference Manual” included in this part describes the assembly 
language for the UNIX system that runs on the PDP-11. The PDP-11 assembler is not avail- 
able on the ULTRIX-32 system. 
LEARN - Computer-Aided Instruction on UNIX 
(Second Edition) 


Brian W. Kernighan 
Michael E. Lesk 


Bell Laboratories 
Murray Hill, New Jersey 07974 


1. Introduction. 


Learn is a driver for CAI scripts. It is intended to permit the easy composition of lessons 
and lesson fragments to teach people computer skills. Since it is teaching the same system on 
which it is implemented, it makes direct use of UNIX† facilities to create a controlled UNIX 
environment. The system includes two main parts: (1) a driver that interprets the lesson 
scripts; and (2) the lesson scripts themselves. At present there are six scripts: 


- basic file handling commands 
- the UNIX text editor ed 
- advanced file handling 
- the eqn language for typing mathematics 
- the “-ms” macro package for document formatting 
- the C programming language 

The purported advantages of CAI scripts for training in computer skills include the follow- 
ing: 

(a) students are forced to perform the exercises that are in fact the basis of training in 
any case; 

(b) students receive immediate feedback and confirmation of progress; 

(c) students may progress at their own rate; 

(d) no schedule requirements are imposed; students may study at any time convenient 
for them; 

(e) the lessons may be improved individually and the improvements are immediately 
available to new users; 

(f) since the student has access to a computer for the CAI script there is a place to do 
exercises; 

(g) the use of high technology will improve student motivation and the interest of their 
management. 


Opposed to this, of course, is the absence of anyone to whom the student may direct questions. 
If CAI is used without a ‘‘counselor’’ or other assistance, it should properly be compared to a 
textbook, lecture series, or taped course, rather than to a seminar. CAI has been used for 
many years in a variety of educational areas.[1,2,3] The use of a computer to teach itself, 
however, offers unique advantages. The skills developed to get through the script are exactly those 
needed to use the computer; there is no waste effort. 


The scripts written so far are based on some familiar assumptions about education; these 
assumptions are outlined in the next section. The remaining sections describe the operation of 
the script driver and the particular scripts now available. The driver puts few restrictions on the 
script writer, but the current scripts are of a rather rigid and stereotyped form in accordance 
with the theory in the next section and practical limitations. 


†UNIX is a Trademark of Bell Laboratories. 


2. Educational Assumptions and Design. 


First, the way to teach people how to do something is to have them do it. Scripts should 
not contain long pieces of explanation; they should instead frequently ask the student to do 
some task. So teaching is always by example: the typical script fragment shows a small example 
of some technique and then asks the user to either repeat that example or produce a variation 
on it. All are intended to be easy enough that most students will get most questions right, 
reinforcing the desired behavior. 


Most lessons fall into one of three types. The simplest presents a lesson and asks for a 
yes or no answer to a question. The student is given a chance to experiment before replying. 
The script checks for the correct reply. Problems of this form are sparingly used. 


The second type asks for a word or number as an answer. For example, a lesson on files 
might say 


How many files are there in the current directory? Type "answer N", where N is the number 
of files. 

The student is expected to respond (perhaps after experimenting) with 

answer 17 


or whatever. Surprisingly often, however, the idea of a substitutable argument (i.e., replacing 
N by 17) is difficult for non-programmer students, so the first few such lessons need real care. 


The third type of lesson is open-ended: a task is set for the student, appropriate parts of 
the input or output are monitored, and the student types ready when the task is done. Figure 1 
shows a sample dialog that illustrates the last of these, using two lessons about the cat 
(concatenate, i.e., print) command taken from early in the script that teaches file handling. Most 
learn lessons are of this form. 


After each correct response the computer congratulates the student and indicates the les- 
son number that has just been completed, permitting the student to restart the script after that 
lesson. If the answer is wrong, the student is offered a chance to repeat the lesson. The 
“speed” rating of the student (explained in section 5) is given after the lesson number when 
the lesson is completed successfully; it is printed only for the aid of script authors checking out 
possible errors in the lessons. 


It is assumed that there is no foolproof way to determine if the student truly 
‘‘understands’’ what he or she is doing; accordingly, the current learn scripts only measure 
performance, not comprehension. If the student can perform a given task, that is deemed to be 
‘‘learning.’’ 


The main point of using the computer is that what the student does is checked for 
correctness immediately. Unlike many CAI scripts, however, these scripts provide few facilities 
for dealing with wrong answers. In practice, if most of the answers are not right the script is a 
failure: the universal solution to student error is to provide a new, easier script. Anticipating 
possible wrong answers is an endless job, and it is really easier as well as better to provide a 
simpler script. 


Along with this goes the assumption that anything can be taught to anybody if it can be 
broken into sufficiently small pieces. Anything not absorbed in a single chunk is just 
subdivided. 


To avoid boring the faster students, however, an effort is made in the files and editor 
scripts to provide three tracks of different difficulty. The fastest sequence of lessons is aimed at 
roughly the bulk and speed of a typical tutorial manual and should be adequate for review and 
for well-prepared students. The next track is intended for most users and is roughly twice as 


Figure 1: Sample dialog from basic files script 

(Student responses in italics; ‘$’ is the prompt) 


A file can be printed on your terminal 
by using the "cat" command. Just say 
"cat file" where "file" is the file name. 
For example, there is a file named 
"food" in this directory. List it 
by saying "cat food"; then type "ready". 
$ cat food 
this is the file 
named food. 
$ ready 

Good. Lesson 3.3a (1) 

Of course, you can print any file with "cat". 
In particular, it is common to first use 
"ls" to find the name of a file and then "cat" 
to print it. Note the difference between 
"ls", which tells you the name of the file, 
and "cat", which tells you the contents. 
One file in the current directory is named for 
a President. Print the file, then type "ready". 
$ cat President 
cat: can’t open President 
$ ready 

Sorry, that’s not right. Do you want to try again? yes 
Try the problem again. 
$ ls 
.ocopy 
X1 
roosevelt 
$ cat roosevelt 
this file is named roosevelt 
and contains three lines of 
text. 
$ ready 

Good. Lesson 3.3b (0) 

The "cat" command can also print several files 
at once. In fact, it is named "cat" as an abbreviation 
for "concatenate".... 





long. Typically, for example, the fast track might present an idea and ask for a variation on the 
example shown; the normal track will first ask the student to repeat the example that was 
shown before attempting a variation. The third and slowest track, which is often three or four 
times the length of the fast track, is intended to be adequate for anyone. (The lessons of 
Figure 1 are from the third track.) The multiple tracks also mean that a student repeating a course 
is unlikely to hit the same series of lessons: this makes it profitable for a shaky user to back up 
and try again, and many students have done so. 


The tracks are not completely distinct, however. Depending on the number of correct 
answers the student has given for the last few lessons, the program may switch tracks. The 
driver is actually capable of following an arbitrary directed graph of lesson sequences, as 
discussed in section 5. Some more structured arrangement, however, is used in all current scripts 
to aid the script writer in organizing the material into lessons. It is sufficiently difficult to write 
lessons that the three-track theory is not followed very closely except in the files and editor 
scripts. Accordingly, in some cases, the fast track is produced merely by skipping lessons from 
the slower track. In others, there is essentially only one track. 


The main reason for using the learn program rather than simply writing the same material 
as a workbook is not the selection of tracks, but actual hands-on experience. Learning by doing 
is much more effective than pencil and paper exercises. 


Learn also provides a mechanical check on performance. The first version in fact would 
not let the student proceed unless it received correct answers to the questions it set and it 
would not tell a student the right answer. This somewhat Draconian approach has been 
moderated in version 2. Lessons are sometimes badly worded or even just plain wrong; in such 
cases, the student has no recourse. But if a student is simply unable to complete one lesson, 
that should not prevent access to the rest. Accordingly, the current version of learn allows the 
student to skip a lesson that he cannot pass; a ‘‘no’’ answer to the ‘‘Do you want to try again?’’ 
question in Figure 1 will pass to the next lesson. It is still true that learn will not tell the 
student the right answer. 


Of course, there are valid objections to the assumptions above. In particular, some stu- 
dents may object to not understanding what they are doing; and the procedure of smashing 
everything into small pieces may provoke the retort ‘‘you can't cross a ditch in two jumps.’’ 
Since writing CAI scripts is considerably more tedious than ordinary manuals, however, it is 
safe to assume that there will always be alternatives to the scripts as a way of learning. In fact, 
for a reference manual of 3 or 4 pages it would not be surprising to have a tutorial manual of 
20 pages and a (multi-track) script of 100 pages. Thus the reference manual will exist long 
before the scripts. 


3. Scripts. 


As mentioned above, the present scripts try at most to follow a three-track theory. Thus 
little of the potential complexity of the possible directed graph is employed, since care must be 
taken in lesson construction to see that every necessary fact is presented in every possible path 
through the units. In addition, it is desirable that every unit have alternate successors to deal 
with student errors. 


In most existing courses, the first few lessons are devoted to checking prerequisites. For 
example, before the student is allowed to proceed through the editor script the script verifies 
that the student understands files and is able to type. It is felt that the sooner lack of student 
preparation is detected, the easier it will be on the student. Anyone proceeding through the 
scripts should be getting mostly correct answers; otherwise, the system will be unsatisfactory 
both because the wrong habits are being learned and because the scripts make little effort to 
deal with wrong answers. Unprepared students should not be encouraged to continue with 
scripts. 


There are some preliminary items which the student must know before any scripts can be 
tried. In particular, the student must know how to connect to a UNIX system, set the terminal 
properly, log in, and execute simple commands (e.g., learn itself). In addition, the character 
erase and line kill conventions (# and @) should be known. It is hard to see how this much 
could be taught by computer-aided instruction, since a student who does not know these basic 
skills will not be able to run the learning program. A brief description on paper is provided 
(see Appendix A), although assistance will be needed for the first few minutes. This assis- 
tance, however, need not be highly skilled. 


The first script in the current set deals with files. It assumes the basic knowledge above 
and teaches the student about the ls, cat, mv, rm, cp and diff commands. It also deals with 
the abbreviation characters *, ?, and [] in file names. It does not cover pipes or I/O 
redirection, nor does it present the many options on the ls command. 


This script contains 31 lessons in the fast track: two are intended as prerequisite checks, 
seven are review exercises. There are a total of 75 lessons in all three tracks, and the 
instructional passages typed at the student to begin each lesson total 4,476 words. The average lesson 
thus begins with a 60-word message. In general, the fast track lessons have somewhat longer 
introductions, and the slow tracks somewhat shorter ones. The longest message is 144 words 
and the shortest 14. 


The second script trains students in the use of the context editor ed, a sophisticated editor 
using regular expressions for searching.⁵ All editor features except encryption, mark names and 
‘;’ in addressing are covered. The fast track contains 2 prerequisite checks, 93 lessons, and a 
review lesson. It is supplemented by 146 additional lessons in other tracks. 


A comparison of sizes may be of interest. The ed description in the reference manual is 
2,572 words long. The ed tutorial⁶ is 6,138 words long. The fast track through the ed script is 
7,407 words of explanatory messages, and the total ed script, 242 lessons, has 15,615 words. 
The average ed lesson is thus also about 60 words; the largest is 171 words and the smallest 10. 
The original ed script represents about three man-weeks of effort. 


The advanced file handling script deals with ls options, I/O diversion, pipes, and 
supporting programs like pr, wc, tail, spell and grep. (The basic file handling script is a prerequisite.) 
It is not as refined as the first two scripts: this is reflected at least partly in the fact that it 
provides much less of a full three-track sequence than they do. On the other hand, since it is 
perceived as ‘‘advanced,’’ it is hoped that the student will have somewhat more sophistication and 
be better able to cope with it at a reasonably high level of performance. 


A fourth script covers the eqn language for typing mathematics. This script must be run 
on a terminal capable of printing mathematics, for instance the DASI 300 and similar 
Diablo-based terminals, or the nearly extinct Model 37 teletype. Again, this script is relatively short of 
tracks: of 76 lessons, only 17 are in the second track and 2 in the third track. Most of these 
provide additional practice for students who are having trouble in the first track. 


The -ms script for formatting macros is a short one-track only script. The macro 
package it describes is no longer the standard, so this script will undoubtedly be superseded in the 
future. Furthermore, the linear style of a single learn script is somewhat inappropriate for the 
macros, since the macro package is composed of many independent features, and few users 
need all of them. It would be better to have a selection of short lesson sequences dealing with 
the features independently. 


The script on C is in a state of transition. It was originally designed to follow a tutorial on 
C, but that document has since become obsolete. The current script has been partially 
converted to follow the order of presentation in The C Programming Language,⁷ but this job is not 
complete. The C script was never intended to teach C; rather it is supposed to be a series of 
exercises for which the computer provides checking and (upon success) a suggested solution. 


This combination of scripts covers much of the material which any user will need to know 
to make effective use of the UNIX system. With enlargement of the advanced files course to 
include more on the command interpreter, there will be a relatively complete introduction to 
UNIX available via learn. Although we make no pretense that learn will replace other 
instructional materials, it should provide a useful supplement to existing tutorials and reference 
manuals. 


4. Experience with Students. 


Learn has been installed on many different UNIX systems. Most of the usage is on the 
first two scripts, so these are more thoroughly debugged and polished. As a (random) sample 
of user experience, the learn program has been used at Bell Labs at Indian Hill for 10,500 
lessons in a four month period. About 3600 of these are in the files script, 4100 in the editor, and 
1400 in advanced files. The passing rate is about 80%, that is, about 4 lessons are passed for 
every one failed. There have been 86 distinct users of the files script, and 58 of the editor. On 
our system at Murray Hill, there have been nearly 4000 lessons over four weeks that include 
Christmas and New Year. Users have ranged in age from six up. 


It is difficult to characterize typical sessions with the scripts; many instances exist of some- 
one doing one or two lessons and then logging out, as do instances of someone pausing in a 
script for twenty minutes or more. In the earlier version of learn, the average session in the 
files course took 32 minutes and covered 23 lessons. The distribution is quite broad and 
skewed, however; the longest session was 130 minutes and there were five sessions shorter 
than five minutes. The average lesson took about 80 seconds. These numbers are roughly typ- 
ical for non-programmers; a UNIX expert can do the scripts at approximately 30 seconds per les- 
son, most of which is the system printing. 


At present working through a section of the middle of the files script took about 1.4 
seconds of processor time per lesson, and a system expert typing quickly took 15 seconds of 
real time per lesson. A novice would probably take at least a minute. Thus, as a rough approx- 
imation, a UNIX system could support ten students working simultaneously with some spare 
capacity. 
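

The capacity estimate can be checked with a little arithmetic. The sketch below is ours and is not part of learn; it assumes a single processor, uses the 1.4 seconds of processor time per lesson quoted above, and assumes each novice completes one lesson per minute. The function name is hypothetical.

```python
def cpu_fraction(students, cpu_seconds_per_lesson=1.4, real_seconds_per_lesson=60.0):
    """Fraction of one processor used by `students` working simultaneously.

    Assumes every student finishes one lesson per `real_seconds_per_lesson`
    and that each lesson costs `cpu_seconds_per_lesson` of processor time.
    """
    return students * cpu_seconds_per_lesson / real_seconds_per_lesson


load = cpu_fraction(10)
print(f"{load:.0%} of the processor")  # prints "23% of the processor"
```

Under these assumptions ten novices consume roughly a quarter of the processor, which is consistent with the claim of ten simultaneous students with some spare capacity.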


5. The Script Interpreter. 


The learn program itself merely interprets scripts. It provides facilities for the script writer 
to capture student responses and their effects, and simplifies the job of passing control to and 
recovering control from the student. This section describes the operation and usage of the 
driver program, and indicates what is required to produce a new script. Readers only interested 
in the existing scripts may skip this section. 


The file structure used by learn is shown in Figure 2. There is one parent directory 
(named lib) containing the script data. Within this directory are subdirectories, one for each 
subject in which a course is available, one for logging (named log), and one in which user 
subdirectories are created (named play). The subject directory contains master copies of all 
lessons, plus any supporting material for that subject. In a given subdirectory, each lesson is a 
single text file. Lessons are usually named systematically; the file that contains lesson n is 
called Ln. 


When learn is executed, it makes a private directory for the user to work in, within the 
learn portion of the file system. A fresh copy of all the files used in each lesson (mostly data 
for the student to operate upon) is made each time a student starts a lesson, so the script writer 
may assume that everything is reinitialized each time a lesson is entered. The student directory 
is deleted after each session; any permanent records must be kept elsewhere. 


The script writer must provide certain basic items in each lesson: 

(1) the text of the lesson; 

(2) the set-up commands to be executed before the user gets control; 

(3) the data, if any, which the user is supposed to edit, transform, or otherwise process; 

(4) the evaluating commands to be executed after the user has finished the lesson, to decide 
whether the answer is right; and 

(5) a list of possible successor lessons. 

Learn tries to minimize the work of bookkeeping and installation, so that most of the effort 
involved in script production is in planning lessons, writing tutorial paragraphs, and coding tests 
of student performance. 




Figure 2: Directory structure for learn 


lib 
    play 
        student1 
            files for student1... 
        student2 
            files for student2... 
    files 
        L0.1a        lessons for files course 
        L0.1b 
        ... 
    editor 
    (other courses) 
    log 


The basic sequence of events is as follows. First, learn creates the working directory. 
Then, for each lesson, learn reads the script for the lesson and processes it a line at a time. 
The lines in the script are: (1) commands to the script interpreter to print something, to create 
a file, to test something, etc.; (2) text to be printed or put in a file; (3) other lines, which are 
sent to the shell to be executed. One line in each lesson turns control over to the user; the 
user can run any UNIX commands. The user mode terminates when the user types yes, no, 
ready, or answer. At this point, the user's work is tested; if the lesson is passed, a new lesson 
is selected, and if not the old one is repeated. 
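

The three kinds of script line can be told apart mechanically. The following is a minimal sketch, not the actual driver: it assumes that only a bare #print, #create and #next are followed by text or data lines, and it ignores the text that may follow #match. The names classify and TEXT_COMMANDS are hypothetical; the sample lesson is abbreviated from Figure 3.

```python
# Commands whose following lines are data rather than shell input (an assumption).
TEXT_COMMANDS = {"#create", "#next"}


def classify(script_lines):
    """Label each script line as 'command', 'text' or 'shell'."""
    kinds = []
    in_text = False
    for line in script_lines:
        if line.startswith("#"):
            words = line.split()
            name = words[0]
            # A bare "#print" prints the lines that follow; "#print file" does not.
            in_text = name in TEXT_COMMANDS or (name == "#print" and len(words) == 1)
            kinds.append(("command", line))
        elif in_text:
            kinds.append(("text", line))
        else:
            kinds.append(("shell", line))
    return kinds


lesson = [
    "#print",
    'Print the file, then type "ready".',
    "#create roosevelt",
    "this file is named roosevelt",
    "#copyout",
    "#user",
    "#uncopyout",
    "tail -3 .ocopy >X1",
    "#cmp X1 roosevelt",
    "#log",
    "#next",
    "3.2b 2",
]

for kind, line in classify(lesson):
    print(f"{kind:8} {line}")
```

In this sketch the tail line is the only one handed to the shell; everything else is either an interpreter command or material belonging to the command before it.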


Let us illustrate this with the script for the second lesson of Figure 1; this is shown in 
Figure 3. 

Lines which begin with # are commands to the learn script interpreter. For example, 

#print 

causes printing of any text that follows, up to the next line that begins with a sharp. 

#print file 

prints the contents of file; it is the same as cat file but has less overhead. Both forms of #print 
have the added property that if a lesson is failed, the #print will not be executed the second 
time through; this avoids annoying the student by repeating the preamble to a lesson. 

#create filename 

creates a file of the specified name, and copies any subsequent text up to a # to the file. This 
is used for creating and initializing working files and reference data for the lessons. 

#user 

gives control to the student; each line he or she types is passed to the shell for execution. The 
#user mode is terminated when the student types one of yes, no, ready or answer. At that 
time, the driver resumes interpretation of the script. 

#copyin 
#uncopyin 

Anything the student types between these commands is copied onto a file called .copy. This lets 
the script writer interrogate the student's responses upon regaining control. 


Figure 3: Sample Lesson 


#print 
Of course, you can print any file with "cat". 
In particular, it is common to first use 
"ls" to find the name of a file and then "cat" 
to print it. Note the difference between 
"ls", which tells you the name of the files, 
and "cat", which tells you the contents. 
One file in the current directory is named for 
a President. Print the file, then type "ready". 
#create roosevelt 
this file is named roosevelt 
and contains three lines of 
text. 
#copyout 
#user 
#uncopyout 
tail -3 .ocopy >X1 
#cmp X1 roosevelt 
#log 
#next 
3.2b 2 


#copyout 
#uncopyout 


Between these commands, any material typed at the student by any program is copied to the file 
.ocopy. This lets the script writer interrogate the effect of what the student typed, which true 
believers in the performance theory of learning usually prefer to the student’s actual input. 

#pipe 
#unpipe 

Normally the student input and the script commands are fed to the UNIX command interpreter 
(the ‘‘shell’’) one line at a time. This won't do if, for example, a sequence of editor commands 
is provided, since the input to the editor must be handed to the editor, not to the shell. 
Accordingly, the material between #pipe and #unpipe commands is fed continuously through a 
pipe so that such sequences work. If copyout is also desired the copyout brackets must include 
the pipe brackets. 


There are several commands for setting status after the student has attempted the lesson. 

#cmp file1 file2 

is an in-line implementation of cmp, which compares two files for identity. 

#match stuff 

The last line of the student’s input is compared to stuff, and the success or fail status is set 
according to it. Extraneous things like the word answer are stripped before the comparison is 
made. There may be several #match lines; this provides a convenient mechanism for handling 
multiple ‘‘right’’ answers. Any text up to a # on subsequent lines after a successful #match is 
printed; this is illustrated in Figure 4, another sample lesson. 


#bad stuff 

This is similar to #match, except that it corresponds to specific failure answers: this can be 
used to produce hints for particular wrong answers that have been anticipated by the script 
writer. 

#succeed 
#fail 

print a message upon success or failure (as determined by some previous mechanism). 


Figure 4: Another Sample Lesson 


#print 
What command will move the current line 
to the end of the file? Type 
"answer COMMAND", where COMMAND is the command. 
#copyin 
#user 
#uncopyin 
#match m$ 
#match .m$ 
"m$" is easier. 
#log 
#next 
63.1d 10 
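

The #match comparison described above can be pictured as follows. This is a minimal sketch under stated assumptions, not the driver's code: it assumes the student's input has been captured as text (as #copyin does with the file .copy), that the comparison is an exact string match after surrounding white space is dropped, and that only a leading word answer is stripped. The function names are hypothetical.

```python
def last_answer(copied_input):
    """Return the student's last non-empty input line, without a leading 'answer'."""
    lines = [line.strip() for line in copied_input.splitlines() if line.strip()]
    if not lines:
        return ""
    words = lines[-1].split()
    if words and words[0] == "answer":
        words = words[1:]
    return " ".join(words)


def match(copied_input, alternatives):
    """True if the last answer equals any of the #match alternatives."""
    return last_answer(copied_input) in alternatives


# The student experiments in the editor, then answers the lesson of Figure 4.
typed = "ed junk\n.m$\nq\nanswer m$\n"
print(match(typed, ["m$", ".m$"]))
```

Listing several alternatives plays the role of several #match lines: either m$ or .m$ sets the success status for the lesson in Figure 4.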


When the student types one of the ‘‘commands’’ yes, no, ready, or answer, the driver 
terminates the #user command, and evaluation of the student’s work can begin. This can be 
done either by the built-in commands above, such as #match and #cmp, or by status returned 
by normal UNIX commands, typically grep and test. The last command should return status true 
(0) if the task was done successfully and false (non-zero) otherwise; this status return tells the 
driver whether or not the student has successfully passed the lesson. 


Performance can be logged: 

#log file 

writes the date, lesson, user name and speed rating, and a success/failure indication on file. 
The command 

#log 

by itself writes the logging information in the logging directory within the learn hierarchy, and 
is the normal form. 


#next 

is followed by a few lines, each with a successor lesson name and an optional speed rating on it. 
A typical set might read 

25.1a 10 
25.2a 5 
25.3a 2 

indicating that unit 25.1a is a suitable follow-on lesson for students with a speed rating of 10 
units, 25.2a for students with speed near 5, and 25.3a for speed near 2. Speed ratings are 
maintained for each session with a student; the rating is increased by one each time the student gets 
a lesson right and decreased by four each time the student gets a lesson wrong. Thus the driver 
tries to maintain a level such that the users get 80% right answers. The maximum rating is 
limited to 10 and the minimum to 0. The initial rating is zero unless the student specifies a 
different rating when starting a session. 
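

The 80% figure follows from the step sizes: with a pass rate p, the expected change in rating per lesson is p(+1) + (1 - p)(-4), which is zero when p = 0.8. The sketch below illustrates the bookkeeping. It is a minimal illustration, not the driver's code; in particular, picking the successor whose listed speed is nearest the current rating is our assumption about what ‘‘near’’ means, and all function names are hypothetical.

```python
MIN_RATING, MAX_RATING = 0, 10


def update_rating(rating, passed):
    """Add one for a pass, subtract four for a failure, and clamp to 0..10."""
    rating += 1 if passed else -4
    return max(MIN_RATING, min(MAX_RATING, rating))


def choose_successor(successors, rating):
    """Pick the successor whose listed speed is nearest the rating (assumed rule).

    `successors` is a list of (lesson_name, speed) pairs from the #next lines.
    """
    return min(successors, key=lambda pair: abs(pair[1] - rating))[0]


def expected_change(pass_rate):
    """Average rating change per lesson for a given probability of passing."""
    return pass_rate * 1 - (1 - pass_rate) * 4


successors = [("25.1a", 10), ("25.2a", 5), ("25.3a", 2)]
rating = 0
for passed in [True, True, True, True, False, True]:
    rating = update_rating(rating, passed)
print(rating, choose_successor(successors, rating))  # prints "1 25.3a"
```

A student who passes four lessons and then fails one is back at zero, so only students passing more than four lessons in five drift toward the fast track.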

If the student passes a lesson, a new lesson is selected and the process repeats. If the 
student fails, a false status is returned and the program reverts to the previous lesson and tries 
another alternative. If it cannot find another alternative, it skips forward a lesson. The 
student can terminate a session at any time by typing bye, which causes a graceful exit from learn. 
Hanging up is the usual novice’s way out. 


The lessons may form an arbitrary directed graph, although the present program imposes 
a limitation on cycles in that it will not present a lesson twice in the same session. If the 
student is unable to answer one of the exercises correctly, the driver searches for a previous lesson 
with a set of alternatives as successors (following the #next line). From the previous lesson 
with alternatives one route was taken earlier; the program simply tries a different one. 
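

The search for another route can be sketched as a walk back through the lessons already passed. This is only an illustration of the idea under our own assumptions about the data structures; the lesson names, the dictionary of successors, and the function retry_route are all hypothetical, and the speed ratings are left out for brevity.

```python
def retry_route(history, successors, seen):
    """Find an untried successor of the most recent lesson that has one.

    history:    lessons passed so far, oldest first
    successors: maps a lesson name to the names listed after its #next line
    seen:       every lesson already presented in this session
    Returns None when no alternative remains.
    """
    for lesson in reversed(history):
        for candidate in successors.get(lesson, []):
            if candidate not in seen:  # never present a lesson twice
                return candidate
    return None


successors = {
    "3.2a": ["3.3a", "3.3b", "3.3c"],
    "3.3a": ["3.4a"],
}
history = ["3.2a", "3.3a"]        # lessons passed so far
seen = {"3.2a", "3.3a", "3.4a"}   # 3.4a was presented and failed
print(retry_route(history, successors, seen))  # prints "3.3b"
```

When the function returns None the driver has no alternative left, which corresponds to the case in which it skips forward a lesson.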


It is perfectly possible to write sophisticated scripts that evaluate the student’s speed of 
response, or try to estimate the elegance of the answer, or provide detailed analysis of wrong 
answers. Lesson writing is so tedious already, however, that most of these abilities are likely to 
go unused. 


The driver program depends heavily on features of the UNIX system that are not available 
on many other operating systems. These include the ease of manipulating files and directories, 
file redirection, the ability to use the command interpreter as just another program (even in a 
pipeline), command status testing and branching, the ability to catch signals like interrupts, and 
of course the pipeline mechanism itself. Although some parts of learn might be transferable to 
other systems, some generality will probably be lost. 


A bit of history: The first version of learn had fewer built-in commands in the driver 
program, and made more use of the facilities of the UNIX system itself. For example, file 
comparison was done by creating a cmp process, rather than comparing the two files within learn. 
Lessons were not stored as text files, but as archives. There was no concept of the in-line 
document; even #print had to be followed by a file name. Thus the initialization for each les- 
son was to extract the archive into the working directory (typically 4-8 files), then #print the 
lesson text. 


The combination of such things made learn rather slow and demanding of system 
resources. The new version is about 4 or 5 times faster, because fewer files and processes are 
created. Furthermore, it appears even faster to the user because in a typical lesson, the printing 
of the message comes first, and file setup with #create can be overlapped with printing, so that 
when the program finishes printing, it is really ready for the user to type at it. 


It is also a great advantage to the script maintainer that lessons are now just ordinary text 
files, rather than archives. They can be edited without any difficulty, and UNIX text 
manipulation tools can be applied to them. The result has been that there is much less resistance to 
going in and fixing substandard lessons. 


6. Conclusions 


The following observations can be made about secretaries, typists, and other 
non-programmers who have used learn: 


(a) A novice must have assistance with the mechanics of communicating with the computer 
to get through to the first lesson or two; once the first few lessons are passed people can 
proceed on their own. 

(b) The terminology used in the first few lessons is obscure to those inexperienced with 
computers. It would help if there were a low level reference card for UNIX to supplement the 
existing programmer oriented bulky manual and bulky reference card. 

(c) The concept of ‘‘substitutable argument’’ is hard to grasp, and requires help. 

(d) They enjoy the system for the most part. Motivation matters a great deal, however. 

It takes an hour or two for a novice to get through the script on file handling. The total time 
for a reasonably intelligent and motivated novice to proceed from ignorance to a reasonable 
ability to create new files and manipulate old ones seems to be a few days, with perhaps half of 
each day spent on the machine. 


The normal way of proceeding has been to have students in the same room with someone 
who knows the UNIX system and the scripts. Thus the student is not brought to a halt by 
difficult questions. The burden on the counselor, however, is much lower than that on a 
teacher of a course. Ideally, the students should be encouraged to proceed with instruction 
immediately prior to their actual use of the computer. They should exercise the scripts on the 
same computer and the same kind of terminal that they will later use for their real work, and 
their first few jobs for the computer should be relatively easy ones. Also, both training and ini- 
tial work should take place on days when the hardware and software are working reliably. 
Rarely is all of this possible, but the closer one comes the better the result. For example, if it 
is known that the hardware is shaky one day, it is better to attempt to reschedule training for 
another one. Students are very frustrated by machine downtime; when nothing is happening, it 
takes some sophistication and experience to distinguish an infinite loop, a slow but functioning 
program, a program waiting for the user, and a broken machine.* 


One disadvantage of training with learn is that students come to depend completely on the 
CAI system, and do not try to read manuals or use other learning aids. This is unfortunate, not 
only because of the increased demands for completeness and accuracy of the scripts, but 
because the scripts do not cover all of the UNIX system. New users should have manuals 
(appropriate for their level) and read them; the scripts ought to be altered to recommend suit- 
able documents and urge students to read them. 


There are several other difficulties which are clearly evident. From the student's 
viewpoint, the most serious is that lessons still crop up which simply can't be passed. Some- 
times this is due to poor explanations, but just as often it is some error in the lesson itself — a 
botched setup, a missing file, an invalid test for correctness, or some system facility that 
doesn't work on the local system in the same way it did on the development system. It takes 
knowledge and a certain healthy arrogance on the part of the user to recognize that the fault is 
not his or hers, but the script writer's. Permitting the student to get on with the next lesson 
regardless does alleviate this somewhat, and the logging facilities make it easy to watch for les- 
sons that no one can pass, but it is still a problem. 


The biggest problem with the previous learn was speed (or lack thereof); it was often 
excruciatingly slow and a significant drain on the system. The current version so far does not 
seem to have that difficulty, although some scripts, notably eqn, are intrinsically slow. eqn, for 
example, must do a lot of work even to print its introductions, let alone check the student 
responses, but delay is perceptible in all scripts from time to time. 


Another potential problem is that it is possible to break learn inadvertently, by pushing 
interrupt at the wrong time, or by removing critical files, or any number of similar slips. The 
defenses against such problems have steadily been improved, to the point where most students 
should not notice difficulties. Of course, it will always be possible to break learn maliciously, 
but this is not likely to be a problem. 


One area is more fundamental — some commands are sufficiently global in their effect 
that learn currently does not allow them to be executed at all. The most obvious is cd, which 
changes to another directory. The prospect of a student who is learning about directories inad- 
vertently moving to some random directory and removing files has deterred us from even writ- 
ing lessons on cd, but ultimately lessons on such topics probably should be added. 


* We have even known an expert programmer to decide the computer was broken when he had simply left 
his terminal in local mode. Novices have great difficulties with such problems. 


University of California 


1. Introduction 


You have just finished your years as a student at the local fighter’s guild. After much 
practice and sweat you have finally completed your training and are ready to embark upon a 
perilous adventure. As a test of your skills, the local guildmasters have sent you into the 
Dungeons of Doom. Your task is to return with the Amulet of Yendor. Your reward for the 
completion of this task will be a full membership in the local guild. In addition, you are 
allowed to keep all the loot you bring back from the dungeons. 


In preparation for your journey, you are given an enchanted mace, a bow, and a quiver 
of arrows taken from a dragon’s hoard in the far off Dark Mountains. You are also outfitted 
with elf-crafted armor and given enough food to reach the dungeons. You say goodbye to 
family and friends for what may be the last time and head up the road. 


You set out on your way to the dungeons and after several days of uneventful travel, you 
see the ancient ruins that mark the entrance to the Dungeons of Doom. It is late at night, so 
you make camp at the entrance and spend the night sleeping under the open skies. In the 
morning you gather your weapons, put on your armor, eat what is almost your last food, and 
enter the dungeons. 


2. What is going on here? 


You have just begun a game of rogue. Your goal is to grab as much treasure as you can, 
find the Amulet of Yendor, and get out of the Dungeons of Doom alive. On the screen, a map 
of where you have been and what you have seen on the current dungeon level is kept. As you 
explore more of the level, it appears on the screen in front of you. 


Rogue differs from most computer fantasy games in that it is screen oriented. Commands 
are all one or two keystrokes¹ and the results of your commands are displayed graphically 
on the screen rather than being explained in words.² 

Another major difference between rogue and other computer fantasy games is that once 
you have solved all the puzzles in a standard fantasy game, it has lost most of its excitement 
and it ceases to be fun. Rogue, on the other hand, generates a new dungeon every time you 
play it and even the author finds it an entertaining and exciting game. 


¹ As opposed to pseudo English sentences. 


² A minimum screen size of 24 lines by 80 columns is required. If the screen is larger, only the 24x80 section 
will be used for the map. 
3. What do all those things on the screen mean? 


In order to understand what is going on in rogue you have to first get some grasp of what 
rogue is doing with the screen. The rogue screen is intended to replace the “You can see ...” 
descriptions of standard fantasy games. Figure 1 is a sample of what a rogue screen might 
look like. 


3.1. The bottom line 


At the bottom line of the screen are a few pieces of cryptic information describing your 
current status. Here is an explanation of what these things mean: 


Level This number indicates how deep you have gone in the dungeon. It starts at one and 
goes up as you go deeper into the dungeon. 


Gold The number of gold pieces you have managed to find and keep with you so far. 


Hp Your current and maximum hit points. Hit points indicate how much damage you can 
take before you die. The more you get hit in a fight, the lower they get. You can 
regain hit points by resting. The number in parentheses is the maximum number your 
hit points can reach. 


Str Your current strength and maximum ever strength. This can be any integer from three 
to 31, inclusive. The higher the number, the stronger 
you are. The number in the parentheses is the maximum strength you have attained so 
far this game. 


Ac Your current armor class. This number indicates how effective your armor is in stop- 
ping blows from unfriendly creatures. The lower this number is, the more effective the 
armor. 


Exp These two numbers give your current experience level and experience points. As you 
do things, you gain experience points. At certain experience point totals, you gain an 
experience level. The more experienced you are, the better you are able to fight and to 
withstand magical attacks. 
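

The fields described above can be pulled apart mechanically. The following is only a sketch in Python, assuming the status line has exactly the layout shown in Figure 1; the function name parse_status and the dictionary keys are hypothetical and are not part of rogue.

```python
import re

# Pattern for a status line laid out as in Figure 1, for example:
#   Level: 1 Gold: 0 Hp: 12(12) Str: 16(16) Ac: 6 Exp: 1/0
# The layout is an assumption taken from that figure only.
STATUS_RE = re.compile(
    r"Level:\s*(\d+)\s+Gold:\s*(\d+)\s+"
    r"Hp:\s*(\d+)\((\d+)\)\s+Str:\s*(\d+)\((\d+)\)\s+"
    r"Ac:\s*(-?\d+)\s+Exp:\s*(\d+)/(\d+)"
)

# Hypothetical key names, one per captured number, in order.
KEYS = ("level", "gold", "hp", "max_hp", "str", "max_str",
        "ac", "exp_level", "exp_points")


def parse_status(line):
    """Return the numbers on a status line as a dictionary of ints."""
    match = STATUS_RE.search(line)
    if match is None:
        raise ValueError("not a status line: %r" % line)
    return dict(zip(KEYS, (int(g) for g in match.groups())))


if __name__ == "__main__":
    status = parse_status(
        "Level: 1 Gold: 0 Hp: 12(12) Str: 16(16) Ac: 6 Exp: 1/0")
    print(status["hp"], "of", status["max_hp"], "hit points")
```

In the parenthesized pairs the first number is the current value and the second is the maximum, which is why Hp and Str each yield two keys.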


3.2. The top line 


The top line of the screen is reserved for printing messages that describe things that are 
impossible to represent visually. If you see a “--More--” on the top line, this means that 
rogue wants to print another message on the screen, but it wants to make certain that you 
have read the one that is there first. To read the next message, just type a space. 


Level: 1 Gold: 0 Hp: 12(12) Str: 16(16) Ac: 6 Exp: 1/0 


Figure 1 


3.3. The rest of the screen 


The rest of the screen is the map of the level as you have explored it so far. Each sym- 
bol on the screen represents something. Here is a list of what the various symbols mean: 


@ This symbol represents you, the adventurer. 
-| These symbols represent the walls of rooms. 
+ A door to/from a room. 
. The floor of a room. 
# The floor of a passage between rooms. 
* A pile or pot of gold. 
) A weapon of some sort. 
] A piece of armor. 
! A flask containing a magic potion. 
? A piece of paper, usually a magic scroll. 
= A ring with magic properties. 
/ A magical staff or wand. 
^ A trap, watch out for these. 
% A staircase to other levels. 
: A piece of food. 
A-Z The uppercase letters represent the various inhabitants of the Dungeons of Doom. 
Watch out, they can be nasty and vicious. 


4. Commands 


Commands are given to rogue by typing one or two characters. Most commands can be 
preceded by a count to repeat them (e.g. typing “10s” will do ten searches). Commands for 
which counts make no sense have the count ignored. To cancel a count or a prefix, type 
<ESCAPE>. The list of commands is rather long, but it can be read at any time during the 
game with the “?” command. Here it is for reference, with a short explanation of each com- 
mand. 


? The help command. Asks for a character to give help on. If you type a “*”, it will list 
all the commands, otherwise it will explain what the character you typed does. 


/ This is the “What is that on the screen?” command. A “/” followed by any character 
that you see on the level, will tell you what that character is. For instance, typing “/@” 
will tell you that the “@” symbol represents you, the player. 


h, H, ^H 
Move left. You move one space to the left. If you use upper case “h”, you will continue 
to move left until you run into something. This works for all movement commands (e.g. 
“L” means run in direction “l”). If you use the “control” “h”, you will continue moving 
in the specified direction until you pass something interesting or run into a wall. You 
should experiment with this, since it is a very useful command, but very difficult to 
describe. This also works for all movement commands. 


j Move down. 
k Move up. 


l    Move right. 
y    Move diagonally up and left. 
u    Move diagonally up and right. 
b    Move diagonally down and left. 
n    Move diagonally down and right. 


t    Throw an object. This is a prefix command. When followed with a direction it throws 
an object in the specified direction. (e.g. type “th” to throw something to the left.) 


f    Fight until someone dies. When followed with a direction this will force you to fight the 
creature in that direction until either you or it bites the big one. 


m    Move onto something without picking it up. This will move you one space in the 
direction you specify and, if there is an object there you can pick up, it won’t do it. 


z    Zap prefix. Point a staff or wand in a given direction and fire it. Even non-directional 
staves must be pointed in some direction to be used. 


^    Identify trap command. If a trap is on your map and you can’t remember what type it 
is, you can get rogue to remind you by getting next to it and typing “^” followed by the 
direction that would move you on top of it. 


s    Search for traps and secret doors. Examine each space immediately adjacent to you for 
the existence of a trap or secret door. There is a large chance that even if there is 
something there, you won’t find it, so you might have to search a while before you find 
something. 


>    Climb down a staircase to the next level. Not surprisingly, this can only be done if you 
are standing on a staircase. 


<    Climb up a staircase to the level above. This can’t be done without the Amulet of 
Yendor in your possession. 


.    Rest. This is the “do nothing” command. This is good for waiting and healing. 
i    Inventory. List what you are carrying in your pack. 
I    Selective inventory. Tells you what a single item in your pack is. 
q    Quaff one of the potions you are carrying. 
r    Read one of the scrolls in your pack. 
e    Eat food from your pack. 


w    Wield a weapon. Take a weapon out of your pack and carry it for use in combat, 
replacing the one you are currently using (if any). 


W    Wear armor. You can only wear one suit of armor at a time. This takes extra time. 
T    Take armor off. You can’t remove armor that is cursed. This takes extra time. 


P    Put on a ring. You can wear only two rings at a time (one on each hand). If you aren’t 
wearing any rings, this command will ask you which hand you want to wear it on; 
otherwise, it will place it on the unused hand. The program assumes that you wield your 
sword in your right hand. 


R    Remove a ring. If you are only wearing one ring, this command takes it off. If you are 
wearing two, it will ask you which one you wish to remove. 


d    Drop an object. Take something out of your pack and leave it lying on the floor. Only 
one object can occupy each space. You cannot drop a cursed object at all if you are 
wielding or wearing it. 


c    Call an object something. If you have a type of object in your pack which you wish to 
remember something about, you can use the call command to give a name to that type of 
object. This is usually used when you figure out what a potion, scroll, ring, or staff is 
after you pick it up, or when you want to remember which of those swords in your pack 
you were wielding. 


D    Print out which things you’ve discovered something about. This command will ask you 
what type of thing you are interested in. If you type the character for a given type of 
object (e.g. “!” for potion) it will tell you which kinds of that type of object you’ve 
discovered (i.e., figured out what they are). This command works for potions, scrolls, 
rings, and staves and wands. 


o    Examine and set options. This command is further explained in the section on options. 
^R   Redraws the screen. Useful if spurious messages or transmission errors have messed up 
the display. 


^P   Print last message. Useful when a message disappears before you can read it. This only 
repeats the last message that was not a mistyped command so that you don’t lose 
anything by accidentally typing the wrong character instead of ^P. 


<ESCAPE> 
Cancel a command, prefix, or count. 


! Escape to a shell for some commands. 
Q Quit. Leave the game. 


S Save the current game in a file. It will ask you whether you wish to use the default save 
file. Caveat: Rogue won’t let you start up a copy of a saved game, and it removes the 
save file as soon as you start up a restored game. This is to prevent people from saving a 
game just before a dangerous position and then restarting it if they die. To restore a 
saved game, give the file name as an argument to rogue, as in 
% rogue save_file 


To restart from the default save file (see below), run 
% rogue -r 


v Prints the program version number. 
) Print the weapon you are currently wielding. 
] Print the armor you are currently wearing. 
= Print the rings you are currently wearing. 
@ Reprint the status line on the message line. 


5. Rooms 


Rooms in the dungeons are either lit or dark. If you walk into a lit room, the entire 
room will be drawn on the screen as soon as you enter. If you walk into a dark room, it will 
only be displayed as you explore it. Upon leaving a room, all monsters inside the room are 
erased from the screen. In the darkness you can only see one space in all directions around 
you. A corridor is always dark. 


6. Fighting 


If you see a monster and you wish to fight it, just attempt to run into it. Many times a 
monster you find will mind its own business unless you attack it. It is often the case that dis- 
cretion is the better part of valor. 


7. Objects you can find 


When you find something in the dungeon, it is common to want to pick the object up. 
This is accomplished in rogue by walking over the object (unless you use the “m” prefix, see 
above). If you are carrying too many things, the program will tell you and it won’t pick up 
the object; otherwise it will add it to your pack and tell you what you just picked up. 


Many of the commands that operate on objects must prompt you to find out which 
object you want to use. If you change your mind and don’t want to do that command after 
all, just type an <ESCAPE> and the command will be aborted. 


Some objects, like armor and weapons, are easily differentiated. Others, like scrolls and 
potions, are given labels which vary according to type. During a game, any two of the same 
kind of object with the same label are the same type. However, the labels will vary from game 
to game. 


When you use one of these labeled objects, if its effect is obvious, rogue will remember 
what it is for you. If its effect isn’t extremely obvious you will be asked what you want to 
scribble on it so you will recognize it later, or you can use the “call” command (see above). 


7.1. Weapons 


Some weapons, like arrows, come in bunches, but most come one at a time. In order to 
use a weapon, you must wield it. To fire an arrow out of a bow, you must first wield the bow, 
then throw the arrow. You can only wield one weapon at a time, and you can’t change 
weapons if the one you are currently wielding is cursed. The commands to use weapons are 
“w” (wield) and “t” (throw). 


7.2. Armor 


There are various sorts of armor lying around in the dungeon. Some of it is enchanted, 
some is cursed, and some is just normal. Different armor types have different armor classes. 
The lower the armor class, the more protection the armor affords against the blows of mon- 
sters. Here is a list of the various armor types and their normal armor class: 









Type                            Class 
None                            10 
Leather armor                   8 
Studded leather / Ring mail     7 
Scale mail                      6 
Chain mail                      5 
Banded mail / Splint mail       4 


If a piece of armor is enchanted, its armor class will be lower than normal. If a suit of armor 
is cursed, its armor class will be higher, and you will not be able to remove it. However, not 
all armor with a class that is higher than normal is cursed. 


The commands to use armor are “W” (wear) and “T” (take off). 


7.3. Scrolls 


Scrolls come with titles in an unknown tongue³. After you read a scroll, it disappears 
from your pack. The command to use a scroll is “r” (read). 


7.4. Potions 


Potions are labeled by the color of the liquid inside the flask. They disappear after 
being quaffed. The command to use a potion is “q” (quaff). 


³ Actually, it’s a dialect spoken only by the twenty-seven members of a tribe in Outer Mongolia, but you’re not 
supposed to know that. 
7.5. Staves and Wands 


Staves and wands do the same kinds of things. Staves are identified by a type of wood; 
wands by a type of metal or bone. They are generally things you want to do to something 
over a long distance, so you must point them at what you wish to affect to use them. Some 
staves are not affected by the direction they are pointed, though. Staves come with multiple 
magic charges, the number being random, and when they are used up, the staff is just a piece 
of wood or metal. 


The command to use a wand or staff is “z” (zap). 


7.6. Rings 


Rings are very useful items, since they are relatively permanent magic, unlike the usually 
fleeting effects of potions, scrolls, and staves. Of course, the bad rings are also more powerful. 
Most rings also cause you to use up food more rapidly, the rate varying with the type of ring. 
Rings are differentiated by their stone settings. The commands to use rings are “P” (put on) 
and “R” (remove). 


7.7. Food 


Food is necessary to keep you going. If you go too long without eating you will faint, 
and eventually die of starvation. The command to use food is “e” (eat). 


8. Options 


Due to variations in personal tastes and conceptions of the way rogue should do things, 
there is a set of options you can set that cause rogue to behave in different ways. 


8.1. Setting the options 


There are two ways to set the options. The first is with the “o” command of rogue; the 
second is with the “ROGUEOPTS” environment variable⁴. 


8.1.1. Using the ‘o’ command 


When you type “o” in rogue, it clears the screen and displays the current settings for all 
the options. It then places the cursor by the value of the first option and waits for you to 
type. You can type a <RETURN> which means to go to the next option, a “-” which means to 
go to the previous option, an <ESCAPE> which means to return to the game, or you can give 
the option a value. For boolean options this merely involves typing “t” for true or “f” for 
false. For string options, type the new value followed by a <RETURN>. 


8.1.2. Using the ROGUEOPTS variable 


The ROGUEOPTS variable is a string containing a comma separated list of initial values 
for the various options. Boolean variables can be turned on by listing their name or turned off 
by putting a “no” in front of the name. Thus to set up an environment variable so that jump 
is on, terse is off, and the name is set to “Blue Meanie”, use the command 


% setenv ROGUEOPTS "jump,noterse,name=Blue Meanie"⁵ 


⁴ On Version 6 systems, there is no equivalent of the ROGUEOPTS feature. 


⁵ For those of you who use the Bourne shell, the commands would be 
$ ROGUEOPTS="jump,noterse,name=Blue Meanie" 
$ export ROGUEOPTS 
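

To make the format of the ROGUEOPTS string concrete, here is a rough sketch in Python of how such a string could be split into option settings. It is not rogue's own parser. The function name parse_rogueopts is hypothetical, and the set of boolean option names is an assumption taken from the option list in section 8.2.

```python
# Boolean option names, assumed from the option list in section 8.2.
BOOLEAN_OPTIONS = ("terse", "jump", "flush", "seefloor",
                   "passgo", "tombstone")


def parse_rogueopts(value):
    """Split a ROGUEOPTS-style string into a dict of option settings.

    Items are separated by commas.  "name=value" sets a string option,
    a bare boolean name turns the option on, and a "no" prefix on a
    known boolean name turns it off.
    """
    options = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty items such as a trailing comma
        if "=" in item:
            name, setting = item.split("=", 1)
            options[name.strip()] = setting.strip()
        elif item.startswith("no") and item[2:] in BOOLEAN_OPTIONS:
            options[item[2:]] = False
        else:
            options[item] = True
    return options


if __name__ == "__main__":
    print(parse_rogueopts("jump,noterse,name=Blue Meanie"))
```

Checking the "no" prefix against a list of known names, rather than stripping it blindly, keeps an option whose name merely begins with "no" from being misread.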
8.2. Option list 


Here is a list of the options and an explanation of what each one is for. The default 
value for each is enclosed in square brackets. For character string options, input over fifty 
characters will be ignored. 


terse [noterse] 
Useful for those who are tired of the sometimes lengthy messages of rogue. This is a 
useful option for playing on slow terminals, so this option defaults to terse if you are on 
a slow (1200 baud or under) terminal. 


jump [nojump] 
If this option is set, running moves will not be displayed until you reach the end of the 
move. This saves considerable cpu and display time. This option defaults to jump if 
you are using a slow terminal. 


flush [noflush] 
All typeahead is thrown away after each round of battle. This is useful for those who 
type far ahead and then watch in dismay as a Bat kills them. 


seefloor [seefloor] 
Display the floor around you on the screen as you move through dark rooms. Due to the 
number of characters generated, this option defaults to noseefloor if you are using a slow 
terminal. 


passgo [nopassgo] 
Follow turnings in passageways. If you run in a passage and you run into stone or a 
wall, rogue will see if it can turn to the right or left. If it can only turn one way, it will 
turn that way. If it can turn either or neither, it will stop. This is followed strictly, 
which can sometimes lead to slightly confusing occurrences (which is why it defaults to 
nopassgo). 


tombstone [tombstone] 
Print out the tombstone at the end if you get killed. This is nice but slow, so you can 
turn it off if you like. 


inven [overwrite] 
Inventory type. This can have one of three values: overwrite, slow, or clear. With 
overwrite the top lines of the map are overwritten with the list when inventory is 
requested or when “Which item do you wish to...? ” questions are answered with a “*”. 
However, if the list is longer than a screenful, the screen is cleared. With slow, lists are 
displayed one item at a time on the top of the screen, and with clear, the screen is 
cleared, the list is displayed, and then the dungeon level is re-displayed. Due to speed 
considerations, clear is the default for terminals without clear-to-end-of-line capabilities. 


name [account name] 
This is the name of your character. It is used if you get on the top ten scorer’s list. 


fruit [slime-mold] 
This should hold the name of a fruit that you enjoy eating. It is basically a whimsey 
that rogue uses in a couple of places. 


file [~/rogue.save] 
The default file name for saving the game. If your phone is hung up by accident, rogue 
will automatically save the game in this file. The file name may start with the special 
character “~” which expands to be your home directory. 
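

The turning rule given under passgo in the list above is a small decision table. The Python sketch below restates it; the function name passgo_turn and its two arguments are hypothetical, and how rogue itself tests whether a turn is possible is not shown here.

```python
def passgo_turn(can_turn_left, can_turn_right):
    """Decide what a run does on hitting stone or a wall with passgo set.

    Follows the rule as described for the passgo option: turn if exactly
    one direction is open, stop if both or neither are open.
    """
    if can_turn_left and not can_turn_right:
        return "left"
    if can_turn_right and not can_turn_left:
        return "right"
    return "stop"


if __name__ == "__main__":
    for left in (False, True):
        for right in (False, True):
            print(left, right, passgo_turn(left, right))
```

Stopping when both directions are open leaves the choice to the player, which matches the strict behavior the option description warns about.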


9. Scoring 


Rogue usually maintains a list of the top scoring people or scores on your machine. 
Depending on how it is set up, it can post either the top scores or the top players. In the 
latter case, each account on the machine can post only one non-winning score on this list. If 
you score higher than someone else on this list, or better your previous score on the list, you 
will be inserted in the proper place under your current name. How many scores are kept can 
also be set up by whoever installs it on your machine. 


If you quit the game, you get out with all of your gold intact. If, however, you get killed 
in the Dungeons of Doom, your body is forwarded to your next-of-kin, along with 90% of your 
gold; ten percent of your gold is kept by the Dungeons’ wizard as a fee⁶. This should make 
you consider whether you want to take one last hit at that monster and possibly live, or quit 
and thus stop with whatever you have. If you quit, you do get all your gold, but if you swing 
and live, you might find more. 
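

As a worked example of the arithmetic above, the following Python sketch computes the gold that counts toward your score. The function name gold_kept is hypothetical, and the rounding of the wizard's fee (whole gold pieces, rounded down) is an assumption; the text does not say how rogue rounds.

```python
def gold_kept(gold, killed):
    """Return the gold that survives the end of the game.

    Quitting keeps everything.  Being killed costs a ten percent fee,
    taken here in whole gold pieces rounded down (an assumption).
    """
    if not killed:
        return gold
    fee = gold // 10
    return gold - fee


if __name__ == "__main__":
    print(gold_kept(200, killed=False))  # quitting with 200 gold
    print(gold_kept(200, killed=True))   # dying with 200 gold
```

With 200 gold pieces, quitting keeps all 200, while dying leaves 180 after the fee.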


If you just want to see what the current top players/games list is, you can type 
% rogue -s 


10. 
Berkeley Font Catalogue 


Introduction 


This catalog gives samples of the various fonts available at Berkeley using 
vtroff on our Versatec and Varian. We have them working 4 pages across on a 36 
inch Versatec, and rotated 90 degrees on a Benson-Varian 11 inch plotter. The 
same software should be adaptable to an 11 inch Versatec, and in fact is running 
at several other sites. However, since we do not have one here, that driver isn’t 
part of this distribution. Such a driver is available from Tom Ferrin at UCSF. 


To use these fonts: 


(1) Hershey. This is the default font. The Hershey font is currently the only 
complete font, with all 16 point sizes and all the special characters troff 
knows about. To get it, use vtroff directly. To illustrate this with the -ms 
macro package: 

vtroff -ms paper.nr 

(2) Fonts with roman, italic, and bold, such as nonie. You can load all three 
fonts with, for example: 

vtroff -F nonie -ms paper.nr 

To get just one of these fonts, use (3) below, appending .r, .i, or .b to the 
font name to specify which font you want mounted, e.g., to get italics in 
delegate, 

vtroff -2 delegate.i -ms paper.nr 

(3) To get a font without a complete set, choose which font (1, 2, or 3) you want 
replaced by the chosen font. For example, to use bocklin as though it were 
bold, since font 3 is bold, use: 

vtroff -3 bocklin -ms paper.nr 


To switch between fonts in troff, use 
.ft 3 
to switch to font 3, for example, or use 
\f3word\f1 

to switch within a line. For more information see the Nroff/Troff User's Manual. 


Special note: troff thinks it is talking to a CAT phototypesetter. Thus, it 
does all sorts of strange things, such as enforcing restrictions like 7.54 inches 
maximum width, 4 fonts, a fixed set of 16 point sizes, proportional spacing by point 
size, etc. 


In particular, the following glyphs will always be taken from the special 
font, no matter what font you are using at the time: 


@, #, ", ', `, <, >, \, {, }, ~, ^, and _ 


This may explain what are otherwise surprising results in some of the subse- 
quent pages. 


In addition, the following Greek letters have been decreed by troff as look- 
ing so much like their Roman counterparts that the Roman version (font 1) is 
always printed, no matter what font is mounted on font 1 at the time: 


A, B, E, Z, H, I, K, M, N, O, P, T, X 


(See Table II in the back of the Nroff/Troff User's Manual for details about what 
glyphs are in each font and how to generate the special glyphs.) 
Font Layout Positions 

[Table: for each octal character code, the glyph at that position in a normal font and in the special font, in columns headed Code, Normal, and Special.] 
APL FONT, 10 POINT ONLY 


AcBiCnDiEe F_GuHoln]-K' LOM INTOoPs Q?ReSIT~UL VUW XDYTZe 01234 56789 
("PS Exe>_ VA CH eH LEELA ~ LVN AITOSr<+/\.>,< 


La(Ar>2=Bwx’ a’ (avons rg erate ae saul wf] a}l e+ 3' 4+ 


,a Se we? w \ 


Baskerville font, roman, bold, italic, 12 point only (Called “basker” on line.) 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 
VP" H#EL& ()ra-e ll] EP ~LNIG';+/7.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time 
enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by 
diligence shall we do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 

PV ESTE ()ee-ef PEI ~~ NIG sei? >,< 

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time 
enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by 
diligence shall we do more with less perplexity. 

ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 

PU ESTE (lie a LTE Sm ~oNI G's 4/2. >, < 

If time be of all things the most precious, wasting time must be, as Poor Richard says, the 
greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and what we 
call time enough, always proves little enough: Let us then up and be doing, and doing to the 
purpose; so by diligence shall we do more with less perplexity. 
Bocklin font, 14 and 28 point only. 


14 point 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 
01234 56789 


Pee at ef 2 og 


If time be of all things the most precious, wasting time must be, as Poor 
Richard says, the greatest prodigality; since, as he elsewhere tells us, 
lost time is never found again; and what we call time enough, always 
proves little enough: Let us then up and be doing, and doing to the 
purpose; so by diligence shall we do more with less perplexity. 


28 point (No punctuation except period.) 


ABCDE FGHIJ KLMNO PQRST 
UVWXYZ abcde fghij klmno pqrst 
uvwxyz 01234 56789. 


If time be of all things the most 
precious wasting time must be as 
Poor Richard says the greatest 
prodigality since as he elsewhere 
tells us lost time is never found 
again and what we call time enough 
always proves little enough Let us 
then up and be doing and doing to 
the purpose so by diligence shall we 
do more with less perplexity. 
Bodoni font, roman, bold, italic, 10 point only. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 


PD" PSS ()re-a LIT EP re ~NI G's +/?.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time enough, 
always proves little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we 
do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 


I"#SES (CO): B-@L] [P+ N[ Ose 7?.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time 
enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by 
diligence shall we do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 

I" PSZ&*():e-2 [J LI r~w~N1[ O';+/?.>,< 

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time enough, 
always proves little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we 
do more with less perplexity. 
Chess, 18 point only 


Note: Our attempt at compatibility with Stanford was only 99% successful. If you use 
a blank space to indicate an empty white square it will come out narrow due to the 
stupidity of troff. Either include the line 

.cs ch 36 

to put yourself in constant spacing mode or else use zero instead of space. You 
should also set the vertical spacing to 18 points. 


.ft ch 
.cs ch 36 
.ps 18 
.vs 18 
HITTTITTTTX 
VOZOZOAOZF 
VZOZOZOOOFr 
V0o0Z0Z0ZF 
VZ0Z0Z0ZOF 
VOMOZOZOZF 
Vj PZOZOZOF 
VOZKZOZOZF 
VZ0Z0ZOZOF 
WUUUUUUUUG 
.sp 
.ft P 
.ps 8 
.cs P 





White mates in three moves. 
Clarendon, 14 and 18 point roman only. From SAIL (Paul Martin & Andy 
Moorer) 


ABCDE FGHIJ KLMNO PQRST UVWY abcde fghij klmno pqrst 
uvwxyz 01234 56789 


"£S re C2 -s[L] SJ rr~N\][O'3+/2?.>,< 


If time be of all things the most precious, wasting time must be, 
as Poor Richard says, the greatest prodigality; since, as he 
elsewhere tells us, lost time is never found again; and what we 
call time enough, always proves little enough: Let us then up 
and be doing, and doing to the purpose; so by diligence shall we 
do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXY abcde 
fghij klmno pqrst uvwxyz 01234 56789 


"HSE Tx? C2 -=L] $3} > r~_\| @'3+/?.>,< 


If time be of all things the most precious, wasting 
time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost 
time is never found again; and what we call time 
enough, always proves little enough: Let us then 
up and be doing, and doing to the purpose; so by 
diligence shall we do more with less perplexity. 
Computer Modern fonts, roman, italic, and bold. (by Don Knuth) 6, 7, 8, 9, 10, 11, 12 point. (Available as cm) 


Note that the cm fonts are intended for TEX and don’t fare so well with troff. The spacing is not 
proportional by point size, and hence only one point size can be tuned to be nicely spaced. We have tuned the 
10 point size, but the 8 point looks somewhat cramped. 


Some of the punctuation is missing in some of the fonts. Knuth also uses a nonstandard notion of ASCII, 
and hence some glyphs are available only with special symbols such as \(12. Others cannot be accessed at all. 


Knuth’s fonts are somewhat larger than normal, since he intends the output to be reduced before printing. 
Since troff has a limitation of 7.54 inches width on output, this is not practical. Hence, the original fonts have 
been relabelled with the point size they are closest to without reduction. Some fonts (6 point bold, 7 point roman, 
8 point italic and bold, 9 point bold, and 11 point italic) which would have otherwise been missing were generated 
by shrinking the next larger point size of the same style. (This goes against the idea of metafont, but we use the 
tools we have.) 


10 Point Roman 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234
56789!" gPB'() *- ($-~ iN @ $.>,<',,2,758,7,0,0,,,,4,0,A, 9,0, 


If time be of all things the most precious,wasting time must be,as Poor Richard says,the 
greatest prodigality since,as he elsewhere tells us,lost time is never found again and 
what we call time enough,always proves little enough Let us then up and be doing,and 
doing to the purpose so by diligence shall we do more with less perplexity. 


10 Point Italic


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234
56789!" gph W’():F¥-e [Jf lo~nwO';,¢/?.>,<1,5,2,5-54,7,9, 0, 
n, 9,» A, ©, A, ¥, 2, a, 2, 7, 86 


If time be of all things the most precious, wasting time must be, as Poor Richard says,
the greatest prodigality; since, as he elsewhere tells us, lost time is never found again, 
and what we call time enough, always proves little enough: Let us then up and be 
doing, and doing to the purpose; so by diligence shall we do more with less perplexity. 


10 Point Bold 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234


S67sO!" et BS’ ()st*- se []J fpr w~_NAG's + / 2. > <'s ‘y Deny ey BT, %, 
*9 79 7s A, 8, A, B, 2,1, 9, 5 3” 


If time be of all things the most precious, wasting time must be, as Poor Richard says, 
the greatest prodigality; since, as he elsewhere tells us, lost time is never found again; 
and what we call time enough, always proves little enough: Let us then up and be doing, 
and doing to the purpose; so by diligence shall we do more with less perplexity.


6 Point Roman, Bold, and Italic.

7 Point Roman, Bold, and Italic.

8 Point Roman, Bold, and Italic.

9 Point Roman, Bold, and Italic.

10 Point Roman, Bold, and Italic.

11 Point Roman, Bold, and Italic.

12 Point Roman, Bold, and Italic.
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Countdown (22 point, upper case letters only.) From SAIL (Paul Martin) 


HGOE FHL ALTHO POARST LULA 


GEUTTOOLUY HAS IC NITEGEARS 10 GOUT 
OOO WTR BUT TT GOTIPEISATES BY 
BENIG UGLY Ald TLLEGEELE 


Cyrillic, 12 point only 


KUs3 a6ze prxa xxuno aper ysis 


 tume Oe oc ann TXUHIC TXe MOCT mpeHOyc aCTuMr TuMe MyCT Ge ac Cop uxapy Ccafic Txe rpeatect 
npoguranuTah cuHe ac xe encexepe TerNC yc ACCT THKe WC Hesep PoyHN arauh ah xaT e ann TuMe eHOyrxX 
anafic aposec ANTTNe eHoyrx er yc Txex ya anx Ge NOMHY aHX NOUHY TO TxXe nypnoce co 6% muAurene cxann ¢ 
Xo Mope uTx Jecc cepnazenth 


WK Xo Ys Z3 ana b>6 doz eve fg gor hox imu k>x loa m>u nu 0-0 
p-a rp soc tor Uy Vos yor Z>8 


Delegate, roman, italic, and bold, 12 point only 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
1"#STE' Chr aes LIL SP r>~_N1@%3+/7.5,< 


If time be of all things the most precious, wasting time must be, as Poor Richard 
says, the greatest prodigality; since, as he elsewhere tells us, lost time is 
never found again; and what we call time enough, always proves little enough: Let 
us then up and be doing, and doing to the purpose; so by diligence shall we do more 
with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
L"ESKa OC) Fee L JL Y~LNIG 5 4#/7.>,< 

If time be of all things the most precious, wasting time must be, as Poor Richard says, 
the greatest prodigality; since, as he elsewhere tells us, lost time is never found 


again; and what we call time enough, always proves little enough: Let us then up and be 
doing, and doing to the purpose; so by diligence shall we do more with less perplexity.


6-36 Berkeley Font Catalogue 
ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 


P'HStTSEt CVs e-a lI Siraw~_Ni@et;+/7.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard 
says, the greatest prodigality; since, as he elsewhere tells us, lost time is 
never found again; and what we call time enough, always proves little enough: Let 


us then up and be doing, and doing to the purpose; so by diligence shall we do more 
with less perplexity. 


Fix fixed width font, 6, 9, 10, 12, 14 point


6 point

ABCDE FGHIJ KLMNO PQRST UVWXY abcde fghij klmno pqrst uvwxyz 01234 56789

rrezsesrCrveweae CJ) al sos. De €

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality; since, as he elsewhere
tells us, lost time is never found again; and what we call time enough, always proves little enough: Let us then up and be doing, and doing
to the purpose; so by diligence shall we do more with less perplexity.


9 point 

ABCDE FGHIJ KLMNO PQRST UVWXY abcde fghij klmno pqrst uvwxyz 01234 56789

Pr" SSIS? > CPOs eea lI SP rrwuvNfp@r';+477.35, < 

If time be of all things the most precious, wasting time must be, as Poor Richard says, the
greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and
what we call time enough, always proves little enough: Let us then up and be doing, and doing
to the purpose; so by diligence shall we do more with less perplexity.


10 point
ABCDE FGHIJ KLMNO PQRST UVWXY abcde fghij klmno pqrst uvwxyz 01234 56789
LY ee se’ Cd) ema we CIS Prw~Lnvn]| O33 477.3, < 


If time be of all things the most precious, wasting time must be, as Poor Richard
says, the greatest prodigality; since, as he elsewhere tells us, lost time is never
found again; and what we call time enough, always proves little enough: Let us then
up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.
12 point


ABCDE FGHIJ KLMNO PQRST UVWXY abcde fghij klmno pqrst uvwxyz 01234
56789


Lv gs 2h > C)raee-e LI Ep rr~-N1 O's + [?.>.< 


If time be of all things the most precious, wasting time must be, as
Poor Richard says, the greatest prodigality; since, as he elsewhere
tells us, lost time is never found again; and what we call time
enough, always proves little enough: Let us then up and be doing, and
doing to the purpose; so by diligence shall we do more with less
perplexity.


14 point 


ABCDE FGHIJ KLMNO PQRST UVWXY abcde fghij klmno pqrst
uvwxyz 01234 56789


fe eS%8?> C)se-=L]lfyrr~_v\]@ts4+/?.5> 
ck | 


If time be of all things the most precious, wasting time
must be, as Poor Richard says, the greatest prodigality;
since, as he elsewhere tells us, lost time is never found
again; and what we call time enough, always proves little
enough: Let us then up and be doing, and doing to the
purpose; so by diligence shall we do more with less
perplexity.
Gacham, roman, bold, italic, 10 point only

The gacham font is almost indistinguishable from the fix font. In fact, it has been
pointed out that our gacham roman and bold fonts really are fix. Sigh. They are
included anyway for convenience.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789


1" se$2&’ 0) se-e CI] SP ar~Llv] @t'34+/7.3,6€ 

If time be of all things the most precious, wasting time must be, as Poor Richard
says, the greatest prodigality; since, as he elsewhere tells us, lost time is never
found again; and what we call time enough, always proves little enough: Let us then
up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
1" 8#SRa’ (CPs *e- eT JES rr~_NIO'‘5 4+/7.5>,6¢ 


If time be of all things the most precious, wasting time must be, as Poor Richard
says, the greatest prodigality; since, as he elsewhere tells us, lost time is never
found again; and what we call time enough, always proves little enough: Let us then
up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
Peg. ee 4) pee eL) fpPe~w.N [O's 477.5, < 


If time be of all things the most precious, wasting time must be, as Poor Richard
says, the greatest prodigality; since, as he elsewhere tells us, lost time is never
found again; and what we call time enough, always proves little enough: Let us then
up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.

Greek, 10 point only 


This font provides an alternative to the Greek characters on the standard special 
font. 


ARCDE FGHIJ KLMNO PQRST UVWXYZ abcde  fghij kimno i parst uvwxy 
ABXSE @fHIe KAMNO MEPIT TRQEYZ = afxse oynd rye wéper vawtp? 
I@ rips Sa og aA ryerye Tre poer reexiove wartiry Tine aveT fe ar Toop Pixsapd cave ree ypsarect 
weediyabsty ext ar Rs sloswnepe TINS ve hoot fie le tee Goud eyes ard wHaT we XEA\ Tis (Vee 


Gueave rpotee Urrhs evovys Aer we reer ve avd fe bers ow Soury ro rae ruprocs co Sy bidsyerxs ead 
we 80 page weey ecw ripe ety 
The h19 font includes a subset of the h19's graphic character set, plus a
few logical extensions to allow forms and diagrams to be drawn. The characters
are the same as the h19's graphic interpretation set.


‘“abedefstuvaomnrAahi ki ii 


Lege * eee bap eels 
The characters are designed to overlap.


Example of usage for diagrams: 














MC68000 DESIGN MODULE:
* 16-bit CPU
* 32K bytes RAM
* 8K bytes monitor ROM
* Parallel Ports
* 16-bit timers









ofa 





238 
microcomputer 
system 


68K bytes RAM 


64K bytes RAM 


Hebrew, 16, 24, and 36 point only


16 point 


sosy Sanzy 5o3 spawe san 93 91 290) ae nan 01234 56789 
rH ag()s LILI Aw~LNIS"sy7 27.>,K< 


ony wmatin Sue te a2 ynoySoyo omimey Sh onShss anysmoptey. Te see ory 
NTMIONAY NS spt Nowe WTYNINSY. ESA Te 8) 997949 BN Be. TBE UTE 
) 31) 9Ck Wes). | 


24 point 


B207 6 OY 35m: SON@O Ih Q 


LY. 


en owastn ooo owes mo o>oswem op Uf Ao 
DP 30M ww. SANYO NNER! 5 s xyoour 
soxcaxs® | 


36 point (rather ragged) 


Sos nyo awe mt 


3 
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10 point Hershey 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij kimno pqrst uvwxyz 01234 56789 !, $, 
ay &, - G ), an we [. ], te /, ?, . 


\em 73 — 2-,\- 2, \(bu 2¢, \sq os, (rum . \(14 7 ¥, \(12 2 %, \(34 > ¥, Mit > 
fi, \(fl + fl, \(ff > ff, \(Fi + ff, \(Fl + 8, Vide ~ *, \(dg > ¢, \(fm >, \(ct > &N\(rg > 2 
\(co 2 © 

When you flex your fingers in a coffin, it can baffle a giraffe. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij kimno pqrst uuwryz 01234 56789 !, 
$, %, &, ‘i 6D, " “4 an Peer yay ae 


\(em 2 = — 25, \e 2, \(bu we, \(sq ae, Vira 2 2 \(14 2 YN(12 ~ KN(34 > ENE > 
\(fl 2 fl, NUif 2 ff, \(Fi > ff, \UF1 > ft, \(de 7°, N(dg > ft, \(fm + ', \(ct ~ &\(rg > 8N\(co 
0 


When you flex your fingers in a coffin, it can baffle a giraffe.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno parst uvwxyz 01234 56789 !, $, 
ye a Oe oe os fede ats | Pe Par 


\(em 2 4 — #5, \- 2, \(bu 24 \(sq 78, \(ru > ~, \(14 ~ KA(12 = BN(34 = FN fi > fh, 
\(fl > fl, \(ff = ff \(Fi > ff, \(Fl ff, \(de 7 *, \(dg > f, \Uim >", \(et + $\(rg 2 8\(co 
+68 

When you flex your fingers in a coffin, it can baffle a giraffe. 

From special font:"#={}{-~oN|@‘’+>< 


Special characters: \(pl + +, \(mi 7 -, \(eq > =, \(** + *, \(se > 8, \(aa > %, \(ga > *, 
\(al » -, \(sl + 7, \(%a 2 a, \(% > 8, N(% > y, NOMd 2 6, \O%e 2» &, N09 2, NO8Y > 7, 
\(th 7 8, \(41 > 6, NO*%K 2 NCSL A, NCOm 2 a, \C%m 2 vu, \(%0 2 €, \(%0 7 0, \(*p > 7, 
\(tr > p, \O%s 2, N(ts 2 9, N(%% 2 7, \(MU 2 0, NCE > 9, \(P8X x, N08 2 YW, NCPW 2 ©, 
\(*A 2 A, NCB 2 B, ACG oT, \(*D 2 A, NCPE @ EL \(8Z% 9 Z, \(PY ~ A, N\(8H 2 8, \C*L 2 1, 
\(*K 2 K, \C8L 2 A, \(*M > HM, N\C*PN @ NN, N(8C 2 &, \(70 7 0, \(*P 2 TT, \(*R @ P, NC 8S 2 £, 
\(T 2 T, NCSU 2 T, \CSF > 8, N(8X = X, N09 > 8, NCOW 2 0, \(sr 2 V, Non 9 7 \(> = 2 2. 
\(<= 2 Ss, \(== 7 &, \(~= 2 &, V(ap 2 ~, \(l= 7 #4, \(-> 2 9, \(<- > ©, \(ua 2 4, \(da > 
+, \(mu > x, \(di 2 =, \(+- > +, V(eu > u, \(ca an, \(sb 2 C, \(sp > 3, \(ib 7 ¢, \(ip > 
2. \if + ©, \(pd > a, \(gr 2 9, \(no > -, \(is + J, \(pt = =, \(eq > =, \(n0 = -, \(br >|, 
\(dd > ¢. \(rh > #A(1h 2 =| \(bs -Q@ \(or + |, \ct #0, M(t >), Wb 2 \(rt 2 (, \(rb 2 J, 
\(k 74, \(rk +), \Cbv + |, \(lf o | \Crt 2], \de 2 f, \ire 7] 


If time be of all things the most precious, wasting time must be, as Poor Richard says, 

the greatest prodigality; since, as he elsewhere tells us, lost time is never found again: 
and what we call time enough, always proves little enough: Let us then up and be doing, 
and doing to the purpose; so by diligence shall we do more with less perplexity. 


This is an example of a sample in various fonts. 
Hershey font. This is the default font for vtroff. Roman, Italic and Bold in 6, 7, 8, 9, 10,
11, 12, 14, 16, 18, 20, 22, 24, 28, and 36 point. The following examples are 10 point.


If time be of all things the most precious, wasting time must be, as Poor Richard says, 

the greatest prodigality; since, as he elsewhere tells us, lost time is never found again; 
and what we call time enough, always proves little enough: Let us then up and be doing, 
and doing to the purpose; so by diligence shall we do more with less perplexity. 


If time be of all things the most precious, wasting time must be,
as Poor Richard says, the greatest prodigality; since, as he elsewhere tells us, lost time
is never found again; and what we call time enough, always proves little enough: Let 
us then up and be doing, and doing to the purpose; so by diligence shall we do more 
with less perplexity. 


If time be of all things the most precious, wasting time must be, as Poor Richard says, 
the greatest prodigality; since, as he elsewhere tells us, lost time is never found 
again; and what we call time enough, always proves little enough: Let us then up and 
be doing, and doing to the purpose; so by diligence shall we do more with less 
perplexity. 


6 point Roman, Bold, and Italic.

7 point Roman, Bold, and Italic.

8 point Roman, Bold, and Italic.

9 point Roman, Bold, and Italic.

10 point Roman, Bold, and Italic.

11 point Roman, Bold, and Italic.

12 point Roman, Bold, and Italic.

14 point Roman, Bold, and Italic.

16 point Roman, Bold, and Italic.

18 point Roman, Bold, and Italic.

20 point Roman, Bold, and Italic.

22 point Roman, Bold, and Italic.

24 point Roman, Bold, and Italic.

28 point Roman, Bold, and Italic.

36 point Roman, Bold, and Italic.
ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
"HSK S'()s*®- BL] §3r-~N1 O's +/7.5,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the
greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and
what we call time enough, always proves little enough: Let us then up and be doing, and
doing to the purpose; so by diligence shall we do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
L"H#SKR&'()s*m-eLJi~r~N] O'34/7.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says,
the greatest prodigality; since, as he elsewhere tells us, lost time is never found
again; and what we call time enough, always proves little enough: Let us then up and
be doing, and doing to the purpose; so by diligence shall we do more with less 
perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234
56789 


eS ee (rB-2L]6j-~\]O's4/2.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard 
says, the greatest prodigality; since, as he elsewhere tells us, lost time is never 
found again; and what we call time enough, always proves little enough: Let us 
then up and be doing, and doing to the purpose; so by diligence shall we do more
with less perplexity. 
Microgramma font, 10 point only


ABCDE FGHIJ KLMNO PQRST UVWXY abcde fghij klmno pqrst uvwxyz 01234 56789

I" FStLE'(C)sS-aL]fPr~L_nl[O's+/P.>,< 

If time be of all things the most precious, wasting time must be, as Poor Richard says, the
greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and what
we call time enough, always proves little enough: Let us then up and be doing, and doing to
the purpose; so by diligence shall we do more with less perplexity.


Mona font, 24 point only


ABCDE FOF13 ALFANO PORST UD WXY = 
abcde fghij kimno pgrst Dowxyz 01234 56789 


I" #eeR' CI: - §inn_\ @;: 29. 
> mS 


Philadelphia is the most pecksniffian of American
cities, and thus probably leads the world.
- H. L. Mencken
8 point
ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789


I" #SKZE'()s Bes LJ Epr~inlO's+/7.>,< 

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality;
since, as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves little
enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.

ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789


I"BESNA'(C) cP ee LJ] Jawin/O'7+/?.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality; since,
as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves little enough:
Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
"eS%a'()st-aC]ijow~-N10';+4/7.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality;
since, as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves
little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.


10 point 
ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789


"#SRS'C): *% eB L] Ser ~_N|O@';34+/7.5,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the
greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and
what we call time enough, always proves little enough: Let us then up and be doing, and
doing to the purpose; so by diligence shall we do more with less perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
P"H#SEKREa' (Ce *-tL JEP rw NI O's 4/7.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the
greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and what
we call time enough, always proves little enough: Let us then up and be doing, and doing to
the purpose; so by diligence shall we do more with less perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
"#SKZSa' (CO): Fe aL] §~r-~-N1O'3;+/7.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says,
the greatest prodigality; since, as he elsewhere tells us, lost time is never found again;
and what we call time enough, always proves little enough: Let us then up and be doing,
and doing to the purpose; so by diligence shall we do more with less perplexity.
12 point
ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234
56789


I"#S%&'():*-8L]f]-~_\]@';4/?.>,< 


If time be of all things the most precious, wasting time must be, as Poor 
Richard says, the greatest prodigality; since, as he elsewhere tells us, lost 
time is never found again; and what we call time enough, always proves little 
enough: Let us then up and be doing, and doing to the purpose; so by 
diligence shall we do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234
56789


P"#SKRSa'(C):% -2 LILI er~NI G's 3+ /27.5,< 


If time be of all things the most precious, wasting time must be, as Poor
Richard says, the greatest prodigality; since, as he elsewhere tells us, lost 
time is never found again; and what we call time enough, always proves little 
enough: Let us then up and be doing, and doing to the purpose; so by 
diligence shall we do more with less perplexity. 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz
01234 56789 


I eSKRR'():*-2L]{}-~~_\]O';34/?.>,< 


If time be of all things the most precious, wasting time must be, as Poor 
Richard says, the greatest prodigality; since, as he elsewhere tells us,
lost time is never found again; and what we call time enough, always 
proves little enough: Let us then up and be doing, and doing to the 
purpose; so by diligence shall we do more with less perplexity. 
Old English, 8, 14, and 18 point only. (This font is called
‘oldenglish’ on line.)


8 point
ABIBE FS RIDPLANG PERST AB XB abode fqhtt bleu porst ubbege OTH 56789 
rg ' te f]erN Os .>e< 


E bere be of all trin most precious, ion tone must be, as Poor sags, the createst stice, as 
elsetuhere tells us, "ew ernie En epee ta F cme sn po “Let us tren ak 


Botta, and Boing to tre purpose: so by diligence shall ne bo more fxtth less perplexite 


14 point


ARQDE FOMIZ KLMNO LORST UY WKY abede tghij kimno 
porst uvwxyz 01234 56788 


" . Sian_\ @; oes 


If time be of all things the most precious, wasting time must be, as Poor
Richard says, the greatest prodigality; since, as he elsewhere tells us, lost
time is never found again; and what we call time enough, always proves
little enough: Let us then up and be doing, and doing to the purpose; so by
diligence shall we do more with less perplexity.


18 point


ABQDE FOHMIT KRIMNO PORST UVWRYS 
abcde fghij kimno poqrst uvwxyz 01234 567389 


gE ae IN te Oa 


If time be of all things the most precious, wasting time
must be, as Poor Richard says, the greatest prodigality
since, as he elsewhere tells us, lost time is never found
again and what we call time enough, always proves little
enough and I think Im wasting time typing all this stuff
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ABCOE FGHIJ KLMNO PORST UWWXYZ 01234 552785 
P's "U)t- [PrN Of 7.3,< 


IT COULD PROBABLY BE SHOWN BY FACTS AND FIGURES THAT THERE IS NO
DISTINCTLY NATIVE AMERICAN CRIMINAL CLASS EXCEPT CONGRESS.


- MARK TWAIN


Playbill font, 10 point only


ASCE FEMI XLMMD PURST UVWEY2 beds fy hij kimas pest newrys £123 
Ph ESTA Cy: ee LIS Jrwln Ose/t.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality; since, as he
elsewhere tells us, lost time is never found again; and what we call time enough, always proves little enough: Let us then up and be
doing, and doing to the purpose; so by diligence shall we do more with less perplexity.


Script, 18 point only. This font appears to be almost identical to the
‘‘Coronet’’ font from SAIL, except that the period and one other glyph
of Coronet are missing a row, and Coronet is supposed to be 16 point.
(They are both really the same size.)


ABCDE ICHIG KLMNO POLST UVWKYZ abede 


ehij LL sia parst avw ryz 01234 56789 
be 3 Pe ON Or 5 


If time be of all things the most precious, wasting time must be, as Poor
Richard says, the greatest prodigality; since, as he elsewhere tells us, lost time is
never found again; and what we call time enough, always proves little enough: Let
us then up and be doing, and doing to the purpose; so by diligence shall we do
more with less perplexity.
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STEReh, 16 PEI CHLY, Ce LELEG EcSse 


GCEERE FERNS LN CERSY OVENYZ 228% 867s 
g" 2 % § BAL Jr ~L\ O56 ,>,K< 

TEE €CGGED FER iS GH EXEELLENY ELEIEE FER 
PEEFENRS PLEMETIEDS, IT 2GS THE GOYGNTGEE Er 
EEINE GLOEST (OLECOGELE, 


o¢ SOARENM o¢ 


SIGN, 22 POINT ONLY 


ABCDE FGHIJ KLMNO PQRST 
UVWAY2Z *< 0123456789 


Pye ' se-= [JA~_@;/.>,< 
THIS FONT WAS INVENTED BY A 
DRAFTSMAN WHO HAD LOST HIS 
FRENCH CURVE. #SO IT GOES < 


LOWER CASE L IS +, LOWER CASE
R IS &.
Stare hershey font. This font is identical to the hershey font except that the point sizes are one point
smaller, and the width tables are those used for the real typesetter. Hence, this font is useful when
previewing documents that are to be sent to a typesetter to make sure the spacing, paging, and so on are
right. There are Roman, Italic and Bold in 8, 9, 10, 11, 12, 14, and 16 point. The following examples
are 10 point.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij kimno parst uvwxyz 01234 56789 
"ER ke'() rte [] LP rmr~uNl[O' s+ /2.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time
enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by
diligence shall we do more with less perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
I"#SZ&'()it-s [J LJ rr~N[Ots+ 27.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality;
since, as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves
little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less
perplexity.


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
"#S%ae'(): *-=Ha []iL ir w~-N]@'s+ 77.>,< 


If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest 
prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time 
enough, always proves little enough: Let us then up and be doing, and doing to the purpose, so by 
diligence shall we do more with less perplexity. 


8 point Roman, Bold, and Italic.

9 point Roman, Bold, and Italic.

10 point Roman, Bold, and Italic.

11 point Roman, Bold, and Italic.

12 point Roman, Bold, and Italic.

14 point Roman, Bold, and Italic.

16 point Roman, Bold, and Italic.
Times fonts, roman, italic, and bold. 10 point only.

These fonts showed up in a directory labelled “timesroman” along with three other fonts which turned out 
to be nonie, meteor, and news gothic. They are probably not really times fonts, but seem to be pretty close. 
Notice the top of the “2” for a clear difference from a real Times Roman font. 


It is our desire to have a real, digitized version of the times fonts from the phototypesetter. We eventually 
plan to do this. At that point, the times font will probably replace the hershey font as the default. Such a 
Times font is already available from Johns Hopkins University for a fee, but we couldn't redistribute it, so 
we plan to digitize them ourselves.


10 Point 

ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
I" #HSLE()sB-2 [J LY ~N| O'34/2?.5,< 

1s a.*1 2 7) @, O, , 4, 4, %, fi, fl, HF, fH, HH, °, f,', peo 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
I" HELE ()s®-eL] LJ -~_N\| Otte 72.>, < 
1's 2“ “oe © Ty y Mey ty %e SSL, Se FF, Ff, ', 99? 


ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789
PASTS’ ()se-2[] SP a ~_N| O';5+/27.5,< 
’ 7 | “7, @, H , 4, 4, %, fi, fl, ff, fi, ffl, °, f,', ¢ 9° 
UNIX† Assembler Reference Manual


Dennis M. Ritchie 


Bell Laboratories 
Murray Hill, New Jersey 07974 


0. Introduction 


This document describes the usage and input syntax of the UNIX PDP-11 assembler as.
The details of the PDP-11 are not described.


The input syntax of the UNIX assembler is generally similar to that of the DEC assembler 
PAL-11R, although its internal workings and output format are unrelated. It may be useful to 
read the publication DEC-11-ASDB-D, which describes PAL-11R, although naturally one must use 
care in assuming that its rules apply to as. 


As is a rather ordinary assembler without macro capabilities. It produces an output file 
that contains relocation information and a complete symbol table; thus the output is acceptable 
to the UNIX link-editor ld, which may be used to combine the outputs of several assembler runs
and to obtain object programs from libraries. The output format has been designed so that if a 
program contains no unresolved references to external symbols, it is executable without further 
processing. 


1. Usage 
as is used as follows: 


as [ -u ] [ -o output ] file1 ...


If the optional ‘‘-u’’ argument is given, all undefined symbols in the current assembly will be
made undefined-external. See the .globl directive below. 


The other arguments name files which are concatenated and assembled. Thus programs 
may be written in several pieces and assembled together. 


The output of the assembler is by default placed on the file a.out in the current directory;
the ‘‘-o’’ flag causes the output to be placed on the named file. If there were no unresolved
external references, and no errors detected, the output file is marked executable; otherwise, if 
it is produced at all, it is made non-executable. 


2. Lexical conventions 


Assembler tokens include identifiers (alternatively, ‘‘symbols’’ or ‘‘names’’), temporary
symbols, constants, and operators. 


2.1 Identifiers 


An identifier consists of a sequence of alphanumeric characters (including period ‘‘.’’,
underscore ‘‘_’’, and tilde ‘‘~’’ as alphanumeric) of which the first may not be numeric. Only
the first eight characters are significant. When a name begins with a tilde, the tilde is discarded 
and that occurrence of the identifier generates a unique entry in the symbol table which can 
match no other occurrence of the identifier. This feature is used by the C compiler to place 


† UNIX is a Trademark of Bell Laboratories.
names of local variables in the output symbol table without having to worry about making them
unique. 
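
The identifier rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the assembler's own code: the names `is_identifier` and `significant_part` are hypothetical, and the special treatment of a leading tilde is left out because the text does not say how it interacts with the eight-character limit.

```python
import string

# Characters the manual counts as "alphanumeric" for identifiers:
# letters, digits, period, underscore, and tilde.
IDENT_CHARS = set(string.ascii_letters + string.digits + "._~")


def is_identifier(name):
    """Return True if name is a well-formed identifier under the rules above."""
    if not name or name[0] in string.digits:
        return False  # empty, or first character is numeric
    return all(ch in IDENT_CHARS for ch in name)


def significant_part(name):
    """Return the part of the name that is significant: the first eight characters."""
    return name[:8]
```

Under this sketch, two names that agree in their first eight characters are treated as the same symbol.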


2.2 Temporary symbols 
A temporary symbol consists of a digit followed by ‘‘f’’ or ‘‘b’’. Temporary symbols are
discussed fully in §5.1. 


2.3 Constants 


An octal constant consists of a sequence of digits; ‘‘8’’ and ‘‘9’’ are taken to have octal
value 10 and 11. The constant is truncated to 16 bits and interpreted in two’s complement 
notation. 


A decimal constant consists of a sequence of digits terminated by a decimal point ‘‘.’’. 
The magnitude of the constant should be representable in 15 bits; i.e., be less than 32,768. 
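
As a rough illustration of the two numeric forms just described, the following Python sketch evaluates an octal or a decimal constant. The function names are hypothetical, and the code is a model of the rules as stated here rather than the assembler's own implementation. In particular, it does not enforce the 15-bit limit on decimal constants; it simply truncates every result to 16 bits.

```python
def to_signed16(value: int) -> int:
    """Truncate to 16 bits and interpret the result as two's complement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def parse_numeric_constant(token: str) -> int:
    """Evaluate an octal constant ("17") or a decimal constant ("17.")."""
    if token.endswith("."):
        digits, base = token[:-1], 10
    else:
        digits, base = token, 8
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError(f"not a numeric constant: {token!r}")
    value = 0
    for ch in digits:
        # In an octal constant the digits 8 and 9 are accepted and contribute
        # 8 and 9 (octal 10 and 11) in their position.
        value = value * base + int(ch)
    return to_signed16(value)
```

Under this model ‘‘17’’ is 15 decimal, ‘‘17.’’ is 17 decimal, and ‘‘177777’’ is -1.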


A single-character constant consists of a single quote ‘‘’’’ followed by an ASCII character 
not a new-line. Certain dual-character escape sequences are acceptable in place of the ASCII 
character to represent new-line and other non-graphics (see String statements, §5.5). The 
constant’s value has the code for the given character in the least significant byte of the word 
and is null-padded on the left. 


A double-character constant consists of a double quote ‘‘"’’ followed by a pair of ASCII 
characters not including new-line. Certain dual-character escape sequences are acceptable in 
place of either of the ASCII characters to represent new-line and other non-graphics (see String 
statements, §5.5). The constant’s value has the code for the first given character in the least 
significant byte and that for the second character in the most significant byte. 
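
The byte placement described above can be sketched in Python as follows. The function name is hypothetical, and the sketch deliberately ignores the escape sequences of §5.5; it only shows which character lands in which byte of the 16-bit word.

```python
def char_constant(token: str) -> int:
    """Value of a single-character ('a) or double-character ("ab) constant."""
    if token.startswith("'") and len(token) == 2:
        low, high = ord(token[1]), 0          # null-padded on the left
    elif token.startswith('"') and len(token) == 3:
        low, high = ord(token[1]), ord(token[2])
    else:
        raise ValueError(f"not a character constant: {token!r}")
    if "\n" in token[1:] or max(low, high) > 0x7F:
        raise ValueError("characters must be ASCII and not a new-line")
    # First character in the least significant byte, second in the most.
    return (high << 8) | low
```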


2.4 Operators 
There are several single- and double-character operators; see §6. 


2.5 Blanks 


Blank and tab characters may be interspersed freely between tokens, but may not be used 
within tokens (except character constants). A blank or tab is required to separate adjacent 
identifiers or constants not otherwise separated. 


2.6 Comments 


The character ‘‘/’’ introduces a comment, which extends through the end of the line on
which it appears. Comments are ignored by the assembler. 


3. Segments 


Assembled code and data fall into three segments: the text segment, the data segment, 
and the bss segment. The text segment is the one in which the assembler begins, and it is the 
one into which instructions are typically placed. The UNIX system will, if desired, enforce the 
purity of the text segment of programs by trapping write operations into it. Object programs 
produced by the assembler must be processed by the link-editor ld (using its ‘‘-n’’ flag) if the
text segment is to be write-protected. A single copy of the text segment is shared among all 
processes executing such a program. 


The data segment is available for placing data or instructions which will be modified dur- 
ing execution. Anything which may go in the text segment may be put into the data segment. 
In programs with write-protected, sharable text segments, the data segment contains the initialized
but variable parts of a program. If the text segment is not pure, the data segment begins 
immediately after the text segment; if the text segment is pure, the data segment begins at the 
lowest 8K byte boundary after the text segment. 
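
The placement rule for the data segment amounts to a rounding step, sketched below in Python. The function name and its parameters are hypothetical. One case is an assumption on our part: when a pure text segment ends exactly on an 8K boundary, the sketch starts the data segment at that same boundary.

```python
def data_origin(text_size: int, pure_text: bool, boundary: int = 8192) -> int:
    """Address at which the data segment begins, given the text segment size."""
    if text_size < 0:
        raise ValueError("text_size must be non-negative")
    if not pure_text:
        # Impure text: data follows the text segment immediately.
        return text_size
    # Pure text: round up to the next multiple of the boundary (8K bytes).
    # Assumption: a size that is already a multiple is left unchanged.
    return -(-text_size // boundary) * boundary
```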


The bss segment may not contain any explicitly initialized code or data. The length of the
bss segment (like that of text or data) is determined by the high-water mark of the location
counter within it. The bss segment is actually an extension of the data segment and begins
immediately after it. At the start of execution of a program, the bss segment is set to 0.
Typically the bss segment is set up by statements exemplified by


lab: . = .+10 


The advantage in using the bss segment for storage that starts off empty is that the initialization 
information need not be stored in the output file. See also Location counter and Assignment 
Statements below. 


4. The location counter 


One special symbol, ‘‘.’’, is the location counter. Its value at any time is the offset 
within the appropriate segment of the start of the statement in which it appears. The location 
counter may be assigned to, with the restriction that the current segment may not change; 
furthermore, the value of ‘‘.’’ may not decrease. If the effect of the assignment is to increase
the value of ‘‘.’’, the required number of null bytes are generated (but see Segments above). 


5. Statements 


A source program is composed of a sequence of statements. Statements are separated 
either by new-lines or by semicolons. There are five kinds of statements: null statements, 
expression statements, assignment statements, string statements, and keyword statements. 


Any kind of statement may be preceded by one or more labels. 


5.1 Labels 


There are two kinds of label: name labels and numeric labels. A name label consists of a 
name followed by a colon (:). The effect of a name label is to assign the current value and 
type of the location counter ‘‘.’’ to the name. An error is indicated in pass 1 if the name is
already defined; an error is indicated in pass 2 if the ‘‘.’’ value assigned changes the definition
of the label.


A numeric label consists of a digit 0 to 9 followed by a colon (:). Such a label serves to 
define temporary symbols of the form ‘‘nb’’ and ‘‘nf’’, where n is the digit of the label. As in
the case of name labels, a numeric label assigns the current value and type of ‘‘.’’ to the
temporary symbol. However, several numeric labels with the same digit may be used within the
same assembly. References of the form ‘‘nf’’ refer to the first numeric label ‘‘n:’’ forward
from the reference; ‘‘nb’’ symbols refer to the first ‘‘n:’’ label backward from the reference.
This sort of temporary label was introduced by Knuth [The Art of Computer Programming, Vol 1: 
Fundamental Algorithms]. Such labels tend to conserve both the symbol table space of the 
assembler and the inventive powers of the programmer. 
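
The forward and backward lookup can be modelled in a few lines of Python. This is only a sketch: positions are statement indices rather than addresses, the function name is hypothetical, and we assume that a label attached to a statement counts as ‘‘backward’’ for a reference inside that same statement.

```python
import bisect
from typing import Dict, List


def resolve_temporary(ref: str, ref_pos: int, labels: Dict[int, List[int]]) -> int:
    """Resolve a reference such as '1f' or '1b' made at statement ref_pos.

    labels maps each digit to the sorted statement indices that carry the
    numeric label 'n:'.
    """
    if len(ref) != 2 or not ref[0].isdigit() or ref[1] not in "fb":
        raise ValueError(f"not a temporary symbol: {ref!r}")
    positions = labels.get(int(ref[0]), [])
    split = bisect.bisect_right(positions, ref_pos)
    if ref[1] == "f":
        index = split          # first label strictly after the reference
    else:
        index = split - 1      # nearest label at or before the reference
    if index < 0 or index >= len(positions):
        raise ValueError(f"no matching label for {ref!r}")
    return positions[index]
```

With labels ‘‘1:’’ at statements 0, 5, and 9, a reference ‘‘1b’’ at statement 3 resolves to 0 and ‘‘1f’’ resolves to 5.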


5.2 Null statements 


A null statement is an empty statement (which may, however, have labels). A null state- 
ment is ignored by the assembler. Common examples of null statements are empty lines or 
lines containing only a label. 


5.3 Expression statements 


An expression statement consists of an arithmetic expression not beginning with a key- 
word. The assembler computes its (16-bit) value and places it in the output stream, together 
with the appropriate relocation bits. 


5.4 Assignment statements


An assignment statement consists of an identifier, an equals sign (=), and an expression. 
The value and type of the expression are assigned to the identifier. It is not required that the 
type or value be the same in pass 2 as in pass 1, nor is it an error to redefine any symbol by 
assignment. 


Any external attribute of the expression is lost across an assignment. This means that it 
is not possible to declare a global symbol by assigning to it, and that it is impossible to define a 
symbol to be offset from a non-locally defined global symbol. 


As mentioned, it is permissible to assign to the location counter ‘‘.’’. It is required, however,
that the type of the expression assigned be of the same type as ‘‘.’’, and it is forbidden
to decrease the value of ‘‘.’’. In practice, the most common assignment to ‘‘.’’ has the form
‘‘. = .+n’’ for some number n; this has the effect of generating n null bytes.


5.5 String statements 


A String statement generates a sequence of bytes containing ASCII characters. A string 
statement consists of a left string quote ‘‘<’’ followed by a sequence of ASCII characters not 
including newline, followed by a right string quote ‘‘>’’. Any of the ASCII characters may be 
replaced by a two-character escape sequence to represent certain non-graphic characters, as fol- 
lows: 


\n    NL   (012)
\s    SP   (040)
\t    HT   (011)
\e    EOT  (004)
\0    NUL  (000)
\r    CR   (015)
\a    ACK  (006)
\p    PFX  (033)
\\    \
\>    >


The last two are included so that the escape character and the right string quote may be
represented. The same escape sequences may also be used within single- and double-character 
constants (see §2.3 above). 
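
To make the escape rules concrete, here is a Python sketch that turns a string statement into the bytes it would generate. The function name is hypothetical, and the escape codes are taken from the table above; the error behavior is our own choice for illustration.

```python
ESCAPES = {
    "n": 0o12, "s": 0o40, "t": 0o11, "e": 0o04, "0": 0o00,
    "r": 0o15, "a": 0o06, "p": 0o33, "\\": ord("\\"), ">": ord(">"),
}


def assemble_string(statement: str) -> bytes:
    """Expand a string statement such as <hello\\n> into its byte sequence."""
    if not statement.startswith("<"):
        raise ValueError("string statement must begin with '<'")
    out = bytearray()
    i = 1
    while i < len(statement):
        ch = statement[i]
        if ch == "\\":
            i += 1
            if i >= len(statement) or statement[i] not in ESCAPES:
                raise ValueError("unknown escape sequence")
            out.append(ESCAPES[statement[i]])
        elif ch == ">":
            if i != len(statement) - 1:
                raise ValueError("text after the closing '>'")
            return bytes(out)
        elif ch == "\n" or ord(ch) > 0x7F:
            raise ValueError("only ASCII characters other than new-line are allowed")
        else:
            out.append(ord(ch))
        i += 1
    raise ValueError("string not terminated properly")
```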


5.6 Keyword statements 


Keyword statements are numerically the most common type, since most machine instruc- 
tions are of this sort. A keyword statement begins with one of the many predefined keywords 
of the assembler; the syntax of the remainder depends on the keyword. All the keywords are 
listed below with the syntax they require. 


6. Expressions 


An expression is a sequence of symbols representing a value. Its constituents are 
identifiers, constants, temporary symbols, operators, and brackets. Each expression has a type. 


All operators in expressions are fundamentally binary in nature; if an operand is missing 
on the left, a 0 of absolute type is assumed. Arithmetic is two’s complement and has 16 bits of 
precision. All operators have equal precedence, and expressions are evaluated strictly left to 
right except for the effect of brackets. 
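
Because equal precedence is easy to overlook, a small Python model of the evaluation order may help. It is a sketch under stated assumptions: it handles absolute numeric operands only, returns the unsigned 16-bit result, omits division and modulo (the text does not say how they treat signs), and rejects two operators in a row. All names are hypothetical. The operator symbols are those of §6.1.

```python
import re

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "\\>": lambda a, b: a >> b,   # logical right shift
    "\\<": lambda a, b: a << b,   # logical left shift
    "!": lambda a, b: a | ~b,     # a or (not b)
}
TOKEN = re.compile(r"\s*(\d+\.?|\\>|\\<|[-+*&|!\[\]])")


def _number(token):
    if token.endswith("."):                 # decimal constant
        return int(token[:-1]) & 0xFFFF
    value = 0
    for ch in token:                        # octal constant; 8 and 9 allowed
        value = value * 8 + int(ch)
    return value & 0xFFFF


def _expr(tokens, pos):
    acc, pending = 0, None                  # a missing left operand is 0
    while pos < len(tokens) and tokens[pos] != "]":
        tok = tokens[pos]
        pos += 1
        if tok in OPS:
            if pending is not None:
                raise ValueError("two operators in a row")
            pending = tok
            continue
        if tok == "[":
            operand, pos = _expr(tokens, pos)
            if pos >= len(tokens) or tokens[pos] != "]":
                raise ValueError("missing ']'")
            pos += 1
        else:
            operand = _number(tok)
        # No operator between operands behaves like "+".
        acc = OPS[pending or "+"](acc, operand) & 0xFFFF
        pending = None
    if pending is not None:
        raise ValueError("operator without a right operand")
    return acc, pos


def evaluate(text):
    """Evaluate an absolute expression strictly left to right, 16 bits wide."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    value, end = _expr(tokens, 0)
    if end != len(tokens):
        raise ValueError("unbalanced ']'")
    return value
```

In this model ‘‘2+3*4’’ yields 20 decimal, not 14; brackets are needed to obtain the conventional result, as in ‘‘2+[3*4]’’.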


6.1 Expression operators

The operators are:


(blank)  when there is no operator between operands, the effect is exactly the same as if a ‘‘+’’
         had appeared.

+        addition

-        subtraction

*        multiplication

\/       division (note that plain ‘‘/’’ starts a comment)

&        bitwise and

|        bitwise or

\>       logical right shift

\<       logical left shift

%        modulo

!        a!b is a or (not b); i.e., the or of the first operand and the one’s complement of the
         second; most common use is as a unary.

^        result has the value of first operand and the type of the second; most often used to
         define new machine instructions with syntax identical to existing instructions.


Expressions may be grouped by use of square brackets ‘‘[]’’. (Round parentheses are 
reserved for address modes.) 


6.2 Types 


The assembler deals with a number of types of expressions. Most types are attached to 
keywords and used to select the routine which treats that keyword. The types likely to be met 
explicitly are: 


undefined
Upon first encounter, each symbol is undefined. It may become undefined if it is
assigned an undefined expression. It is an error to attempt to assemble an undefined
expression in pass 2; in pass 1, it is not (except that certain keywords require operands
which are not undefined).


undefined external
A symbol which is declared .globl but not defined in the current assembly is an
undefined external. If such a symbol is declared, the link editor ld must be used to
load the assembler’s output with another routine that defines the undefined reference.


absolute
An absolute symbol is defined ultimately from a constant. Its value is unaffected by
any possible future applications of the link-editor to the output file.


text
The value of a text symbol is measured with respect to the beginning of the text segment
of the program. If the assembler output is link-edited, its text symbols may
change in value since the program need not be the first in the link editor’s output.
Most text symbols are defined by appearing as labels. At the start of an assembly, the
value of ‘‘.’’ is text 0.


data
The value of a data symbol is measured with respect to the origin of the data segment
of a program. Like text symbols, the value of a data symbol may change during a subsequent
link-editor run since previously loaded programs may have data segments.
After the first .data statement, the value of ‘‘.’’ is data 0.


bss
The value of a bss symbol is measured from the beginning of the bss segment of a
program. Like text and data symbols, the value of a bss symbol may change during a
subsequent link-editor run, since previously loaded programs may have bss segments.
After the first .bss statement, the value of ‘‘.’’ is bss 0.


external absolute, text, data, or bss
Symbols declared .globl but defined within an assembly as absolute, text, data, or bss
symbols may be used exactly as if they were not declared .globl; however, their value
and type are available to the link editor so that the program may be loaded with others
that reference these symbols.


register
The symbols
r0 ... r5
fr0 ... fr5
sp
pc


are predefined as register symbols. Either they or symbols defined from them must be 
used to refer to the six general-purpose, six floating-point, and the 2 special-purpose 
machine registers. The behavior of the floating register names is identical to that of 
the corresponding general register names; the former are provided as a mnemonic aid. 


other types 
Each keyword known to the assembler has a type which is used to select the routine 
which processes the associated keyword statement. The behavior of such symbols 
when not used as keywords is the same as if they were absolute. 


6.3 Type propagation in expressions 


When operands are combined by expression operators, the result has a type which 
depends on the types of the operands and on the operator. The rules involved are complex to 
state but were intended to be sensible and predictable. For purposes of expression evaluation 
the important types are 


undefined 
absolute 

text 

data 

bss 

undefined external 
other 


The combination rules are then: If one of the operands is undefined, the result is undefined. If
both operands are absolute, the result is absolute. If an absolute is combined with one of the
‘‘other types’’ mentioned above, or with a register expression, the result has the register or
other type. As a consequence, one can refer to r3 as ‘‘r0+3’’. If two operands of ‘‘other
type’’ are combined, the result has the numerically larger type. An ‘‘other type’’ combined with
an explicitly discussed type other than absolute acts like an absolute.


Further rules applying to particular operators are: 


+ If one operand is text-, data-, or bss-segment relocatable, or is an undefined external, the
result has the postulated type and the other operand must be absolute.


- If the first operand is a relocatable text-, data-, or bss-segment symbol, the second
operand may be absolute (in which case the result has the type of the first operand); or
the second operand may have the same type as the first (in which case the result is absolute).
If the first operand is external undefined, the second must be absolute. All other
combinations are illegal.


^ This operator follows no other rule than that the result has the value of the first operand
and the type of the second.


others
It is illegal to apply these operators to any but absolute symbols.
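
The rules of this section can be summarized as a small decision function. The Python sketch below is a simplified model: register and ‘‘other’’ types are left out, the type names are plain strings chosen for illustration, and we assume that the type-transfer operator (written ‘‘^’’ here) takes the second operand's type even when the general rules would say otherwise.

```python
RELOCATABLE = {"text", "data", "bss"}
MOVABLE = RELOCATABLE | {"undefined external"}


def result_type(op: str, left: str, right: str) -> str:
    """Type of 'left op right' under the combination rules of section 6.3."""
    if op == "^":
        return right                      # value of first, type of second
    if "undefined" in (left, right):
        return "undefined"
    if left == "absolute" and right == "absolute":
        return "absolute"
    if op == "+":
        if left in MOVABLE and right == "absolute":
            return left
        if right in MOVABLE and left == "absolute":
            return right
    elif op == "-":
        if left in MOVABLE and right == "absolute":
            return left
        if left in RELOCATABLE and right == left:
            return "absolute"             # difference within one segment
    raise ValueError(f"illegal combination: {left} {op} {right}")
```

For example, subtracting one text symbol from another gives an absolute result, while adding a text symbol to a data symbol is rejected.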


7. Pseudo-operations 


The keywords listed below introduce statements that generate data in unusual forms or 
influence the later operations of the assembler. The metanotation 


[ stuff ] ...


means that 0 or more instances of the given stuff may appear. Also, boldface tokens are 
literals, italic words are substitutable. 


7.1 .byte expression [ , expression] ... 


The expressions in the comma-separated list are truncated to 8 bits and assembled in suc- 
cessive bytes. The expressions must be absolute. This statement and the string statement 
above are the only ones that assemble data one byte at a time.


7.2 .even 


If the location counter ‘‘.’’ is odd, it is advanced by one so the next statement will be
assembled at a word boundary.


7.3 .if expression


The expression must be absolute and defined in pass 1. If its value is nonzero, the .if is
ignored; if zero, the statements between the .if and the matching .endif (below) are ignored.
.if may be nested. The effect of .if cannot extend beyond the end of the input file in which it
appears. (The statements are not totally ignored, in the following sense: .ifs and .endifs are 
scanned for, and moreover all names are entered in the symbol table. Thus names occurring 
only inside an .if will show up as undefined if the symbol table is listed.) 
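
The nesting behavior can be sketched in Python as a filter over source lines. This is a model only: labels and comments are not handled, the function and parameter names are hypothetical, and we assume that the expression of an .if inside an already skipped region is not evaluated but still counts for nesting.

```python
def filter_conditionals(lines, evaluate):
    """Return the lines that would be assembled.

    evaluate is a callable that maps the expression text after .if to an
    integer; a nonzero value means the enclosed statements are assembled.
    """
    stack = []   # one flag per open .if: True while that block is assembled
    kept = []
    for line in lines:
        words = line.split(None, 1)
        keyword = words[0] if words else ""
        assembling = all(stack)
        if keyword == ".if":
            expr = words[1] if len(words) > 1 else ""
            stack.append(assembling and evaluate(expr) != 0)
        elif keyword == ".endif":
            if not stack:
                raise ValueError(".endif without a matching .if")
            stack.pop()
        elif assembling:
            kept.append(line)
    if stack:
        raise ValueError("end of file inside an .if")
    return kept
```

A call such as filter_conditionals(source_lines, int) is enough to try it with literal 0 and 1 conditions.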


7.4 .endif 
This statement marks the end of a conditionally-assembled section of code. See .if above. 


7.5 .globl name [ , name ] ... 


This statement makes the names external. If they are otherwise defined (by assignment or
appearance as a label) they act within the assembly exactly as if the .globl statement were not
given; however, the link editor ld may be used to combine this routine with other routines that
refer to these symbols.


Conversely, if the given symbols are not defined within the current assembly, the link 
editor can combine the output of this assembly with that of others which define the symbols. 
As discussed in §1, it is possible to force the assembler to make all otherwise undefined sym- 
bols external. 


7.6 .text 
7.7 .data 


7.8 .bss 


These three pseudo-operations cause the assembler to begin assembling into the text,
data, or bss segment respectively. Assembly starts in the text segment. It is forbidden to
assemble any code or data into the bss segment, but symbols may be defined and ‘‘.’’ moved
about by assignment.


7.9 .comm name , expression

Provided the name is not defined elsewhere, this statement is equivalent to


.globl name
name = expression ^ name


That is, the type of name is ‘‘undefined external’’, and its value is expression. In fact the name
behaves in the current assembly just like an undefined external. However, the link-editor ld
has been special-cased so that all external symbols which are not otherwise defined, and which
have a non-zero value, are defined to lie in the bss segment, and enough space is left after the
symbol to hold expression bytes. All symbols which become defined in this way are located
before all the explicitly defined bss-segment locations.


8. Machine instructions 


Because of the rather complicated instruction and addressing structure of the PDP-11, the
syntax of machine instruction statements is varied. Although the following sections give the 
syntax in detail, the machine handbooks should be consulted on the semantics. 


8.1 Sources and Destinations 


The syntax of general source and destination addresses is the same. Each must have one 
of the following forms, where reg is a register symbol, and expr is any sort of expression: 


syntax        words   mode

reg             0     00+reg
(reg)+          0     20+reg
-(reg)          0     40+reg
expr(reg)       1     60+reg
(reg)           0     10+reg
*reg            0     10+reg
*(reg)+         0     30+reg
*-(reg)         0     50+reg
*(reg)          1     70+reg
*expr(reg)      1     70+reg
expr            1     67
$expr           1     27
*expr           1     77
*$expr          1     37


The words column gives the number of address words generated; the mode column gives the
octal address-mode number. The syntax of the address forms is identical to that in DEC assemblers,
except that ‘‘*’’ has been substituted for ‘‘@’’ and ‘‘$’’ for ‘‘#’’; the UNIX typing
conventions make ‘‘@’’ and ‘‘#’’ rather inconvenient.


Notice that mode ‘‘*reg’’ is identical to ‘‘(reg)’’; that ‘‘*(reg)’’ generates an index word
(namely, 0); and that addresses consisting of an unadorned expression are assembled as pc- 
relative references independent of the type of the expression. To force a non-relative refer- 
ence, the form ‘‘*$expr’’ can be used, but notice that further indirection is impossible. 
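
As an informal cross-check of the address table in this section, the following Python sketch classifies an operand string and returns its octal mode number and the count of extra address words. The names are hypothetical, and the operand parsing is deliberately crude: an expression used with a register may not itself contain parentheses, ‘‘$’’, or ‘‘*’’, and expressions are not evaluated.

```python
import re

REGISTERS = {"r0": 0, "r1": 1, "r2": 2, "r3": 3, "r4": 4, "r5": 5, "sp": 6, "pc": 7}
REG = r"(r[0-5]|sp|pc)"

# (pattern, base mode, address words); the order matters.
FORMS = [
    (re.compile(rf"^{REG}$"), 0o00, 0),
    (re.compile(rf"^\({REG}\)\+$"), 0o20, 0),
    (re.compile(rf"^-\({REG}\)$"), 0o40, 0),
    (re.compile(rf"^\({REG}\)$"), 0o10, 0),
    (re.compile(rf"^\*{REG}$"), 0o10, 0),
    (re.compile(rf"^\*\({REG}\)\+$"), 0o30, 0),
    (re.compile(rf"^\*-\({REG}\)$"), 0o50, 0),
    (re.compile(rf"^\*\({REG}\)$"), 0o70, 1),       # index word of 0
    (re.compile(rf"^\*[^()$*]+\({REG}\)$"), 0o70, 1),
    (re.compile(rf"^[^()$*]+\({REG}\)$"), 0o60, 1),
]


def address_mode(operand: str):
    """Return (mode, words) for one source or destination operand."""
    text = "".join(operand.split())
    for pattern, base, words in FORMS:
        match = pattern.match(text)
        if match:
            return base + REGISTERS[match.group(1)], words
    if not text or "(" in text or ")" in text:
        raise ValueError(f"unsupported operand: {operand!r}")
    if text.startswith("*$"):
        return 0o37, 1       # absolute (non-relative) reference
    if text.startswith("$"):
        return 0o27, 1       # immediate
    if text.startswith("*"):
        return 0o77, 1       # pc-relative deferred
    return 0o67, 1           # pc-relative
```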


8.3 Simple machine instructions 
The following instructions are defined as absolute symbols: 


clc
clv
clz
cln
sec
sev
sez
sen


They therefore require no special syntax. The PDP-11 hardware allows more than one of the
‘‘clear’’ class, or alternatively more than one of the ‘‘set’’ class to be or-ed together; this may
be expressed as follows:


clc | clv


8.4 Branch 

The following instructions take an expression as operand. The expression must lie in the 
same segment as the reference, cannot be undefined-external, and its value cannot differ from 
the current location of ‘‘.’’ by more than 254 bytes:


br      blos
bne     bvc
beq     bvs
bge     bhis
blt     bec  (= bcc)
bgt     bcc
ble     blo
bpl     bcs
bmi     bes  (= bcs)
bhi


bes (‘‘branch on error set’’) and bec (‘‘branch on error clear’’) are intended to test the error bit
returned by system calls (which is the c-bit).


8.5 Extended branch instructions 

The following symbols are followed by an expression representing an address in the same
segment as ‘‘.’’. If the target address is close enough, a branch-type instruction is generated; if
the address is too far away, a jmp will be used.


jbr     jlos
jne     jvc
jeq     jvs
jge     jhis
jlt     jec
jgt     jcc
jle     jlo
jpl     jcs
jmi     jes
jhi


jbr turns into a plain jmp if its target is too remote; the others (whose names are constructed by
replacing the ‘‘b’’ in the branch instruction’s name by ‘‘j’’) turn into the converse branch over
a jmp to the target address.


8.6 Single operand instructions


The following symbols are names of single-operand machine instructions. The form of 
address expected is discussed in §8.1 above. 


clr sbcb 
clrb ror 
com rorb 
comb rol 
inc rolb 
incb asr 
dec asrb 
decb asl 
neg aslb 
negb jmp 
adc swab 
adcb tst 
sbc tstb 


8.7 Double operand instructions 


The following instructions take a general source and destination (§8.1), separated by a 
comma, as operands. 


mov
movb
cmp
cmpb
bit
bitb
bic
bicb
bis
bisb
add
sub


8.8 Miscellaneous instructions 


The following instructions have more specialized syntax. Here reg is a register name, src 
and dst a general source or destination (§8.1), and expr is an expression: 


jsr     reg, dst
rts     reg
sys     expr
ash     src, reg    (or, als)
ashc    src, reg    (or, alsc)
mul     src, reg    (or, mpy)
div     src, reg    (or, dvd)
xor     reg, dst
sxt     dst
mark    expr
sob     reg, expr


sys is another name for the trap instruction. It is used to code system calls. Its operand is
required to be expressible in 6 bits. The expression in mark must be expressible in six bits,
and the expression in sob must be in the same segment as ‘‘.’’, must not be external-undefined,
must be less than ‘‘.’’, and must be within 510 bytes of ‘‘.’’.


8.9 Floating-point unit instructions

The following floating-point operations are defined, with syntax as indicated:


cfcc
setf
setd
seti
setl
clrf    fdst
negf    fdst
absf    fdst
tstf    fsrc
movf    fsrc, freg    (= ldf)
movf    freg, fdst    (= stf)
movif   src, freg     (= ldcif)
movfi   freg, dst     (= stcfi)
movof   fsrc, freg    (= ldcdf)
movfo   freg, fdst    (= stcfd)
movie   src, freg     (= ldexp)
movei   freg, dst     (= stexp)
addf    fsrc, freg
subf    fsrc, freg
mulf    fsrc, freg
divf    fsrc, freg
cmpf    fsrc, freg
modf    fsrc, freg
ldfps   src
stfps   dst
stst    dst


fsrc, fdst, and freg mean floating-point source, destination, and register respectively. Their
syntax is identical to that for their non-floating counterparts, but note that only floating registers
0-3 can be a freg.


The names of several of the operations have been changed to bring out an analogy with 
certain fixed-point instructions. The only strange case is movf, which turns into either stf or
ldf depending respectively on whether its first operand is or is not a register. Warning: ldf sets
the floating condition codes, stf does not.


9. Other symbols 


9.1 ..


The symbol ‘‘..’’ is the relocation counter. Just before each assembled word is placed in
the output stream, the current value of this symbol is added to the word if the word refers to a
text, data or bss segment location. If the output word is a pc-relative address word that refers
to an absolute location, the value of ‘‘..’’ is subtracted.


Thus the value of ‘‘..’’ can be taken to mean the starting memory location of the program.
The initial value of ‘‘..’’ is 0.


The value of ‘‘..’’ may be changed by assignment. Such a course of action is sometimes
necessary, but the consequences should be carefully thought out. It is particularly ticklish to
change ‘‘..’’ midway in an assembly or to do so in a program which will be treated by the
loader, which has its own notions of ‘‘..’’.


9.2 System calls

System call names are not predefined. They may be found in the file /usr/include/sys.s.


10. Diagnostics 


When an input file cannot be read, its name followed by a question mark is typed and 
assembly ceases. When syntactic or semantic errors occur, a single-character diagnostic is typed 
out together with the line number and the file name in which it occurred. Errors in pass 1 
cause cancellation of pass 2. The possible errors are: 


)    parentheses error
]    parentheses error
>    string not terminated properly
*    indirection (*) used illegally
.    illegal assignment to ‘‘.’’
A    error in address
B    branch address is odd or too remote
E    error in expression
F    error in local (‘‘f’’ or ‘‘b’’) type symbol
G    garbage (unknown) character
I    end of file inside an .if
M    multiply defined symbol as label
O    word quantity assembled at odd address
P    phase error: ‘‘.’’ different in pass 1 and 2
R    relocation error
U    undefined symbol
X    syntax error




UNIX MASTER INDEX 


The UNIX Master Index is a cumulative index; it brings together the indexes 
of all the UNIX volumes. The Master Index appears at the end of each 
volume. 


Each entry is followed by one or more shortened volume titles, indicating the 
volumes in which the topic is discussed and the pages containing the informa- 
tion. The volumes and their shortened titles are shown in the following table: 


Volume            Shortened Title
General use       GEN
Programming       PGM
System manager    SYS


If a topic is discussed in two or more volumes, the shortened volume names 
are presented in alphabetical order. For example, an entry in the Master 
Index might appear in the following way: 


ed line editor 

description, GEN 4-8 to 4-9, SYS 4-6 
ed.hup file

saving text, GEN 2-6 


This entry indicates that a description of the ed line editor can be found on 
pages 4-8 through 4-9 of the GEN volume and page 4-6 of the SYS volume. 
The ed.hup file is discussed on page 3-48 of the GEN volume.


ACRONYMS AND MNEMONICS 


The acronym (or mnemonic) is the preferred entry. The acronym is cross- 
referred from the complete form. 


DEFINITIONS 
Defined terms and glossary terms are indexed.


HOMONYMS


Things of the same name but different meaning are followed by a descriptive 
word or by an abbreviation in parentheses. 


KEYS FOR EXAMPLES, FIGURES, TABLES, AND FOOTNOTES 


Page references for example, figure, and table index entries are keyed. Exam- 
ple: 


Example 4-13E 
Figure 4-13F 
Table 4-13T 
Footnote 4-13n 


NONALPHABETIC CHARACTERS 


Entries containing leading nonalphabetic characters (symbols, numbers, and 
punctuation) are placed at the beginning of the index. Nonalphabetic charac- 
ters within index entries are sorted before alphabetic characters. 


Nonalphabetic characters that serve as terms are indexed in a spelled-out 
form whenever possible. 


INDEX 


oo 


command (DC) 
description, GEN 2-58
command (ed) 
escaping to use UNIX command, 
GEN 3-51E 
command (ex) 
description, GEN 3-95 
command (Mail) 
marking commands for the shell, 
GEN 2-28 
escape (Mail) 
description, GEN 2-25 
$ character (ed) 
printing last line, GEN 3-28 
% command (DC) 
description, GEN 2-57
% prompt 
defined, GEN 3-5 
& command (ex) 
description, GEN 3-96 
+ command (DC) 
description, GEN 2-57
- command (DC) 
description, GEN 2-57
- command (Mail) 
printing previous message, GEN 
2-28 
.. file 
defined, GEN 4-63 
/etc/passwd file 
defined, GEN 4-66 


/etc/rc command file
starting network servers, SYS 5-49 
/sys directory 
contents, SYS 5-36T 
/sys/sys directory 
file prefixes, SYS 5-36T 
/usr/spool/mail directory 
system mailbox and, GEN 2-17 
0 command 
defined, GEN 5-88 
0 command (troff) 
right-justifying digits, GEN 5-87 
0 macro (me) 
specifying section titles for 
contents, GEN 5-41 
1822 interface 
See imp network interface driver 
1c command (me)
defined, GEN 5-43 
returning one-column format, 
GEN 5-35 
1C command (ms) 
returning one-column format, 
GEN 5-6 
2c command (me) 
defined, GEN 5-43 
specifying two-column format, 
GEN 5-35 
2C command (ms) 
specifying two-column format, 
GEN 5-6 
3Com Ethernet controller
See ec network interface driver 
4.2BSD file system 
file set, SYS 5-32T 
4.2BSD Interprocess Communication 
Primer 
See also Interprocess 
communication 
4.2BSD Interprocess Communication 
Primer, SYS 3-5 to 3-28 
4.2BSD Line Printer Spooler 
Manual, PGM 4-99 to 4-105 
See also Line printer spooling 
system (4.2BSD) 
4.2BSD system 
4.1BSD files and, SYS 5-32 to 
5-34 
4.1BSD language processors and, 
SYS 5-34 
adding device drivers, SYS 5-88 
adding users, SYS 5-43 
bug fixes and changes, SYS 1-3 to 
1-21 
changes to the kernel, SYS 5-3 to 
5-15 
configuring for networking support, 
SYS 5-47 to 5-51 
configuring multiple networks, 
SYS 5-48 
creating boot floppy, SYS 5-35 
disk space and, SYS 5-18 
distribution format, SYS 5-18 
hardware supported, SYS 5-17 
installing on VAX/VMS, SYS 
5-17 to 5-71 
making boot cassette, SYS 5-35 
setting up, SYS 5-35 to 5-46 
source directory organization, SYS 
5-89T 
system manual, PGM 4-15 to 4-52 
tailoring to your site, SYS 5-43 
upgrading, SYS 5-32 to 5-34 
4.2BSD System Manual, PGM 4-15 
to 4-52 
: command (DC) 
description, GEN 2-63 
: escape (Mail) 
description, GEN 2-25 
; command (DC) 
description, GEN 2-63 
< symbol 
meaning, GEN 2-10 
= command (sed) 
defined, GEN 3-114 
> symbol
meaning, GEN 2-10 
? escape (Mail) 
description, GEN 2-26 
Eel 
pattern-matching and, GEN 2-8
\* command (troff) 
entering comments in macros, 
GEN 5-89 
_exit function
description, PGM 1-8 


A 


a command (ed) 
defined, GEN 3-34 
using, GEN 3-25 to 3-26 
a command (edit) 
entering, GEN 3-6E 
a command (ex) 
description, GEN 3-88 
A command (me) 
defined, GEN 5-46 
a command (sed) 
See also i command (sed) 
defined, GEN 3-108 
A command (vi) 
defined, GEN 3-78 
a command (vi) 
defined, GEN 3-80 
a option (hunt) 
defined, GEN 5-148 
a option (inv) 
defined, GEN 5-147 
a option (troff) 
defined, GEN 5-50 
a.out file 
as assembler and, GEN 6-53 
defined, GEN 4-63 
aardvark game 
4.2BSD and, SYS 1-17 
ab command (ex) 
See also una command (ex) 
description, GEN 3-87 
AB command (me) 
defined, GEN 5-46 
AB command (ms) 
entering abstract in text, GEN 
5-5 
ab command (nroff/troff) 
message output, GEN 5-81 
abbreviate command (ex) 
See ab command (ex) 


abort command (lpc)
description, PGM 4-103 
Absolute pathname 
See also Relative pathname 
defined, GEN 4-63 
description, GEN 4-33 
Abstract 
entering with -ms, GEN 5-5 
ac command (me) 
defined, GEN 5-46 
ACC LH/DH IMP interface 
See acc network driver 
acc network driver 
4.2BSD improvement, SYS 1-15
Accent 
creating with troff, GEN 5-88E 
entering with -ms, GEN 5-9 
new in -ms, GEN 5-19 
access system call 
4.2BSD improvement, SYS 1-10 
ACM (Association for Computing 
Machinery) 
formatting papers for, GEN 5-46 
acommute routine 
operators and, PGM 2-67 to 2-68 
Action statement (awk) 
description, PGM 3-7 to 3-9 
Active system 
defined, SYS 5-123 
Acute accent 
See Metacharacters 
ad command (nroff/troff) 
defined, GEN 5-61 
j register and, GEN 5-81 
ad driver 
4.2BSD improvement, SYS 1-15
ad.c device driver 
4.2BSD improvement, SYS 5-12
ADB debugging program 
4.2BSD improvement, SYS 1-5
C and, GEN 2-15 
description, PGM 3-51 to 3-77 
addbib utility 
See also refer 
description, SYS 1-5 
addch routine 
defined, PGM 4-80 
Addition 
DC and, GEN 2-60 
Additive operator 
description, GEN 2-53 
Address (edit) 
defined for buffer line, GEN 3-18 


Address (sed) 
description, GEN 3-107 to 3-108 
Address Resolution Protocol 
See arp driver 
addstr routine 
defined, PGM 4-81 
Advisory lock 
compared to hard lock, SYS 1-33 
AE command (ms) 
TL command and, GEN 5-6 
af command (nroff/troff) 
defined, GEN 5-66 
Aho, A.V., & others 
awk programming language, PGM 
3-5 to 3-12 
AI command (ms) 
formatting author’s institution 
name, GEN 5-5 
Alias 
defined, GEN 2-21, 2-38, 4-63 
removing from shell, GEN 4-52 
specifying, GEN 2-21 
alias command (C shell) 
See also unalias command (C 
shell) 
displaying aliases, GEN 4-50E 
alias command (Mail) 
See also alternates command 
(Mail) 
See also metoo option 
defining an alias, GEN 2-21 
description, GEN 2-29 
restriction, GEN 2-21 
alias facility 
shell command files and, GEN 
4-43 
startup and, GEN 4-44 
uses for, GEN 4-43 to 4-44 
aliens game 
distribution and, SYS 1-17 


Allman, E.
-me Reference Manual, GEN 5-39
to 5-48
introduction to SCCS, PGM 3-23
to 3-37
sendmail, SYS 3-59 to 3-71
Sendmail Installation and
Operation Guide, SYS 2-27 to
2-60
writing papers with nroff using
-me, GEN 5-21 to 5-38
Allocator 
description, GEN 2-59 to 2-60 
design rationale, GEN 2-63
ALT key
See ESCAPE key 
alternates command (Mail) 
description, GEN 2-29 
am command (nroff/troff) 
defined, GEN 5-64 
AM macro 
diacritical marks and, GEN 5-19 
Ampersand character (C shell) 
background jobs and, GEN 4-45 
routing output, GEN 4-44 
Ampersand character (ed) 
meaning, GEN 3-42 
printing, GEN 3-42 
s command and, GEN 3-33 to 
3-34 
turning off, GEN 3-34 
uses, GEN 3-42 
Ampersand character (edit) 
repeating s command, GEN 3-20 
Ampersand character (shell) 
multitasking and, GEN 1-29 
ANAME operator (C compiler) 
defined, PGM 2-65 
ANSI Standard X3.9 1978 
exceptions to, PGM 2-88 
extensions, PGM 2-82 to 2-83 
append command (ed) 
See a command (ed) 
append command (edit) 
See a command (edit) 
append command (ex) 
See a command (ex) 
Append mode
See Input mode 
append option (Mail) 
defined, GEN 2-34 
Appendix 
specifying page numbers, GEN 
5-46 
apply program 
description, SYS 1-5 
ar 
4.2BSD improvement, SYS 1-5
ar command (me) 
defined, GEN 5-44 
Arabic number 
setting page number, GEN 5-44 
arff program
4.2BSD improvement, SYS 1-18 
args command (ex) 
description, GEN 3-88 
Argument (C shell) 
defined, GEN 4-63
expanding, GEN 4-60 to 4-61 
Argument (nroff) 
defined, GEN 5-21 
argv variable (C shell) 
defined, GEN 4-63 
script files and, GEN 4-53 
Arithmetic expression (troff) 
entering, GEN 5-92 
Arithmetic language 
See BC language 
Arnold, K.C.R.C. 
Screen package, PGM 4-75 to 
4-98 
Arnold, K.C.R.C., & Toy, M.C. 
guide to the dungeons of doom, 
GEN 6-17 to 6-25 
arp driver 
4.2BSD improvement, SYS 1-15 
ARPA File Transfer Protocol 
ftp program and, SYS 1-6 
ARPA Telnet protocol 
See telnet program 
ARPANET 
sending mail to, GEN 2-26 
UUCP network and, GEN 2-26 
Array (awk) 
description, PGM 3-9 
Array element 
defined, GEN 2-51 
Array identifier 
description, GEN 2-50 
as assembler 
command line format, GEN 6-53E 
defined, GEN 6-53 
errors, GEN 6-64 
reference manual, GEN 6-53 to 
6-64, PGM 4-53 to 4-65 
segment types, GEN 6-54 
as command (nroff/troff) 
defined, GEN 5-64 
ask option (Mail) 
defined, GEN 2-34 
prompting for subject header, 
GEN 2-20 
setting, GEN 2-20 
askcc option (Mail)
defined, GEN 2-34 
asm.sed file 
4.2BSD improvement, SYS 5-13 
Assembler 
replacing, SYS 5-118 
Assignment operator 
description, GEN 2-53 


Assignment statement (as) 
defined, GEN 6-56 
Assignment statement (BC) 
value and, GEN 2-48 
Association for Computing 
Machinery 
See ACM 
Asterisk character 
dot character and, GEN 3-40 
ed and, GEN 3-33 
printing multiple files, GEN 2-8 
shell and, GEN 4-33 
turning off, GEN 2-8 
uses, GEN 3-40 to 3-41 
zero and, GEN 3-41 
Asymmetric protocol 
defined, SYS 3-17 
At sign 
See also CTRL-H 
See also u command (edit) 
deleting a line, GEN 3-8E 
entering in text, GEN 2-4 
erasing characters on input line, 
GEN 2-4 
printing, GEN 3-39 
AU command (ms) 
formatting author’s name in text, 
GEN 5-5 
Author institution 
formatting in text, GEN 5-5 
Author name 
formatting in text, GEN 5-5 
Auto array 
specifying, GEN 2-54 
auto statement (BC) 
forming, GEN 2-55 
autoconf.c file 
4.2BSD improvement, SYS 5-13
Autoconfiguration 
building systems with config, SYS 
5-73 to 5-105 
hardware devices and, SYS 5-75 
requirements for VAX/VMS, SYS 
5-95 
autoindent option (ex) 
description, GEN 3-97 
autoindent option (vi) 
enabling, GEN 3-67 
lisp and, GEN 3-68 
using, GEN 3-73 
autoprint option (ex) 
description, GEN 3-98 
autoprint option (Mail) 
defined, GEN 2-34 


autowrite option (ex) 
description, GEN 3-98 

awk programming language 
command line format, PGM 3-5 
compared with grep, PGM 3-5 
defined, GEN 2-18, PGM 3-5 
description, PGM 3-5 to 3-12 
design, PGM 3-9 to 3-10 
execution time compared, PGM
3-12T
fields, PGM 3-5 
implementation, PGM 3-10 
printing output, PGM 3-6 
program structure, PGM 3-5 
records, PGM 3-5 
uses, PGM 3-10 
variables, PGM 3-8 


B 


B command (me) 
defined, GEN 5-46 
specifying bibliographic section, 
GEN 5-33 
b command (me) 
See also rh command (me) 
defined, GEN 5-42, 5-44 
entering, GEN 5-26 
specifying bold font, GEN 5-36 
specifying fill mode, GEN 5-26 
B command (ms) 
specifying boldface, GEN 5-8 
b command (sed) 
defined, GEN 3-114 
b command (troff) 
creating large brackets, GEN 
5-88E 
B command (vi) 
defined, GEN 3-78 
b command (vi) 
defined, GEN 3-80 
B flag (tar) 
reading block records, SYS 1-9 
writing block records, SYS 1-9 
b option (troff) 
defined, GEN 5-50 
B_CALL flag
4.2BSD improvement, SYS 5-6
ba command (me) 
defined, GEN 5-45 
backgammon game 
See also teachgammon program 
4.2BSD improvement, SYS 1-17
Background command (C shell)
defined, GEN 4-63 
Background job 
description, GEN 4-45 to 4-48 
reading input from terminal, GEN 
4-47E 
suspending, GEN 4-46 
Backslash character 
erasing, GEN 2-4 
Backslash character (ed) 
context search and, GEN 3-43 
restriction, GEN 3-33 
searching for, GEN 3-39E 
special characters and, GEN 3-39 
Backslash character (troff) 
translating for typesetter, GEN 
5-86 
Backus Functional Programming 
Language 
See FP programming language 
Bad block forwarding 
support, SYS 1-18 
bad144 program 
4.2BSD improvement, SYS 1-18 
Baden, S. 
Berkeley FP User Manual, PGM 
2-359 to 2-391 
badsect program 
See also fsck program 
4.2BSD improvement, SYS 1-18 
Base (BC) 
See also ibase; obase 
description, GEN 2-44 to 2-45 
bc command (me)
defined, GEN 5-43 
starting a column, GEN 5-35 
BC language 
C language and, GEN 2-43 
defined, GEN 2-43 
description, GEN 2-43 to 2-55 
displaying library of math 
functions, GEN 2-49 
output bases and, GEN 2-45 
restriction, GEN 2-43 
simple computations and, GEN 
2-43 to 2-44 
subscript restriction, GEN 2-46 
BC program 
exiting, GEN 2-49 
bcmp library routine
4.2BSD improvement, SYS 1-14
bcopy library routine
4.2BSD improvement, SYS 1-14 


bd command (troff) 
defined, GEN 5-59 
BDATA operator (C compiler) 
defined, PGM 2-64 
beautify option (ex) 
description, GEN 3-98 
BEGIN/END pattern 
description, PGM 3-6 
Bell character 
printing, GEN 3-37 
Benson-Varian printer 
output filters and, PGM 4-102 
Berkeley font catalogue, GEN 6-27 
to 6-51 
Berkeley FP User’s Manual, PGM 
2-359 to 2-391 
See also FP programming 
language 
Berkeley network 
See Berknet 
Berkeley Pascal programming 
language 
user’s manual, PGM 2-159 to 
2-209 
Berkeley Pascal User Manual, PGM
2-159 to 2-209
See also Pascal programming
language
Berkeley system 
See UNIX Operating System 
Berkeley VAX/UNIX Assembler 
Reference Manual, PGM 4-53 to 
4-65 
See also as assembler 
Berknet 
sending mail to, GEN 2-27 
bg command (C shell) 
continuing background jobs, GEN 
4-46E 
defined, GEN 4-64 
running suspended job in 
background, GEN 4-47 
bi command (me) 
defined, GEN 5-44 
Bibliographic citations 
formatting, GEN 2-13, 5-18, 5-33 
specifying, GEN 5-34F 
Bibliographic databases 
See roffbib program, SYS 1-8 
Bibliography 
See Bibliographic citations 
bin directory 
defined, GEN 4-64 


Binary date 
Mail program and, GEN 2-37 
Binary operator (C compiler) 
description, PGM 2-66 
Binary option (Mail) 
See Option (Mail) 
bind system call 
assigning socket name, SYS 3-7E 
binding names to sockets, SYS 
1-10 
specifying association, SYS 3-25 
Bit mask 
creating, SYS 3-11 
bl command (me) 
defined, GEN 5-44 
Blau, R., & Joyce, J. 
Edit tutorial, GEN 3-3 to 3-23 
Block device 
description, SYS 5-20 
Block map 
layout of blocks and fragments, 
SYS 1-27F 
Block of text 
footnotes and, GEN 5-36 
indenting from left and right, 
GEN 5-86E 
index entries and, GEN 5-36 
keeping together in text, GEN 
5-26 
Block size 
selecting, SYS 5-41 
Boldface 
entering, GEN 5-8 
Bootstrap monitor 
loading, SYS 5-65 to 5-68 
Bootstrap procedure 
booting from tape, SYS 5-22 
description, SYS 5-22 to 5-31 
details, SYS 5-59 to 5-64 
messages about console bootstrap 
cassette, SYS 5-71 
messages about the distributed 
console media, SYS 5-69 
messages about the distributed 
system, SYS 5-70 
Bootstrap program 
4.2BSD improvement, SYS 5-15 
loading, SYS 5-25 
Bourne shell 
background command, GEN 4-3E 
changing prompt, GEN 4-6 
command execution, GEN 4-23 to 
4-24 
command grammar, GEN 4-26
command substitution and, GEN 
4-18 to 4-20 
command syntax, GEN 4-3 
defined, GEN 4-3 
description, GEN 4-3 to 4-27 
error handling, GEN 4-21 
error signals, GEN 4-21F 
fault handling, GEN 4-21 
group set and, SYS 1-8 
invoking, GEN 4-24 
prompt, GEN 4-6 
redirecting input, GEN 4-4 
redirecting output, GEN 4-3 
Bourne, S.R. 
introducing the UNIX shell, GEN 
4-3 to 4-27 
Bourne, S.R., & Maranzano, J.F. 
ADB debugging program, PGM 
3-51 to 3-77 
Box (nroff/troff) 
creating smallest, GEN 5-68 
box routine 
defined, PGM 4-81 
Boxing 
description, GEN 5-69 
entering, GEN 5-8 to 5-9 
bp command (me) 
See also pa command (me) 
specifying blank column, GEN 
5-35 
specifying page break, GEN 5-23 
bp command (nroff/troff) 
See also ns command (nroff/troff) 
defined, GEN 5-59 
br command (me) 
starting a line, GEN 5-24 
br command (nroff/troff) 
defined, GEN 5-60 
Braces 
argument expansion and, GEN 
4-60E
Braces (EQN) 
typesetting in proper size, GEN 
5-100E 
Brackets (Bourne shell) 
matching any single character,
GEN 4-34 
Brackets (DC) 
placing character string on stack, 
GEN 2-58 
Brackets (ed) 
appearing in character class, GEN 
3-41
deleting line numbers, GEN 3-41, 
3-41E 
Brackets (EQN) 
typesetting in proper size, GEN 
5-100E 
Brackets (Mail) 
beginning a line with, GEN 2-26 
Brackets (nroff/troff) 
creating, GEN 5-88E 
creating large, GEN 5-68 
BRANCH operator (C compiler) 
defined, PGM 2-65 
Break 
defined, GEN 5-22 
space and, GEN 5-23 
specifying, GEN 5-24 
break command (C shell) 
See also breaksw command (C 
shell) 
csh script and, GEN 4-58 
defined, GEN 4-64 
break statement (awk) 
defined, PGM 3-9 
break statement (BC) 
forming, GEN 2-54 
breaksw command (C shell) 
defined, GEN 4-64 
exiting from switch statement, 
GEN 4-58 
Broadcast message 
sending, SYS 3-27E 
Broadcast packet 
See also Broadcast message 
datagram sockets and, SYS 3-27 
Broken bar 
shell and, GEN 2-27 
BSS operator (C compiler) 
defined, PGM 2-64 
bss segment (as) 
See also Assignment statement 
(as) 
See also Location counter (as) 
description, GEN 6-54 
bss statement 
defined, GEN 6-59 
bstring library 
4.2BSD improvement, SYS 1-14 
btlgammon game 
See backgammon game 
buf.h file 
4.2BSD improvement, SYS 5-6 
Buffer 
defined, GEN 3-4
ed and, GEN 3-25 
writing part of, GEN 3-22 
Buffer (nroff/troff) 
flushing output buffer, GEN 5-73 
Buffer (vi) 
description, GEN 3-54 
system commands and, GEN 3-68 
types of, GEN 3-62 
BUFSIZ 
defined, PGM 1-21 
bugfiler program 
4.2BSD improvement, SYS 1-19 
Built-in (M4) 
See Command (M4) 
built-in command (C shell) 
defined, GEN 4-64 
bx command (me) 
boxing words, GEN 5-37 
defined, GEN 5-44 
byte statement (as) 
defined, GEN 6-59 
bzero library routine 
4.2BSD improvement, SYS 1-14 


C 


C argument (nroff) 
specifying, GEN 5-27 
c command (DC) 
description, GEN 2-58
c command (ed) 
defined, GEN 3-34 
using, GEN 3-31 to 3-32 
c command (edit) 
description, GEN 3-18 
c command (ex) 
description, GEN 3-88 
C command (me) 
defined, GEN 5-46 
c command (me) 
centering blocks of text, GEN 
5-27 
defined, GEN 5-43, 5-46
specifying a chapter without 
number, GEN 5-33 
specifying chapters, GEN 5-33 
c command (sed) 
defined, GEN 3-109 
C command (vi) 
defined, GEN 3-78 
C compiler 
description, PGM 2-63 to 2-77 
as programming tool, GEN 2-15
replacing, SYS 5-118 
c escape (Mail) 
description, GEN 2-25 
C flag (lint) 
creating libraries from C source 
code, SYS 1-7 
c flag (mkey) 
specifying file of common words, 
GEN 5-147 
C library 
reinstalling, SYS 5-56E 
c macro (me) 
defined, GEN 5-46 
c number register (nroff/troff) 
defined, GEN 5-81 
c operator (vi) 
defined, GEN 3-80 
C option (hunt) 
defined, GEN 5-148 
C option (tar) 
forcing chdir operations in an 
operation, SYS 1-9 
c option (uucp) 
defined, SYS 5-132 
C preprocessor 
if statements and, SYS 1-5 
line numbers and, SYS 1-5 
C program 
debugging, PGM 3-53 to 3-58 
C programming language 
See also M4 macro processor 
CAI script for, GEN 6-7 
command line format, PGM 1-3 
computers supporting, GEN 2-15 
programming in, GEN 2-14 to 
2-15 
reference manual, PGM 2-5 to 
2-35 
supporting programs, GEN 2-15 
C Programming Language Reference 
Manual, The, PGM 2-5 to 2-35 
See also C programming language 
C shell 
4.2BSD improvement, SYS 1-5
built-in commands, GEN 4-50 to 
4-52 
compared to other command 
interpreters, GEN 4-30 
defined, GEN 4-29 
details for terminal users, GEN 
4-39 to 4-52 
history list and, GEN 4-41 
interrupts and, GEN 4-36
introduction, GEN 4-29 to 4-74 
logging in, GEN 4-39 
metacharacters and, GEN 4-32 
overwriting files and, GEN 4-41 
purpose of, GEN 4-29 
using from the terminal, GEN 
4-30 to 4-38 
C shell variables 
description, GEN 4-40 to 4-41 
set command and, GEN 4-40E 
c2 command (nroff/troff) 
defined, GEN 5-67 
CAI script, GEN 6-9E to 6-11E 
description, GEN 6-6 to 6-7 
prerequisites, GEN 6-6 
prerequisites for the writer, GEN 
6-8 
types of, GEN 6-7 
Campbell, R. 
line printer spooling system 
(4.2BSD), PGM 4-99 to 4-105 
CANBSIZ parameter 
description, SYS 5-121 
canfield game 
See also cfscores program 
4.2BSD improvement, SYS 1-17 
Carbon copy 
See CC: list 
Caret 
See Circumflex character (ed) 
case branch 
description, GEN 4-8 to 4-9 
form of, GEN 4-8E 
case command (C shell) 
defined, GEN 4-64 
cat command (C shell) 
collecting files, PGM 1-5E 
combining files, GEN 3-48, 3-48E 
defined, GEN 4-64 
listing system users, GEN 4-35E 
printing files, GEN 2-7 
printing merged files, GEN 2-11 
printing pipeline information, 
GEN 2-11 
terminating, GEN 4-36 
cat program 
See cat command (C shell) 
CBRANCH operator (C compiler) 
defined, PGM 2-66 
cc 
dbx and, SYS 1-5 
cc command (nroff/troff)
defined, GEN 5-67 


CC: list 
See also askcc option 
adding people to, GEN 2-25 
cctab table 
defined, PGM 2-68 
cd command (C shell) 
See also pushd command (C shell) 
changing working directory, GEN 
2-10
defined, GEN 4-64 
description, GEN 2-29 
working directory and, GEN 4-48 
ce command (me) 
entering, GEN 5-24 
ce command (nroff/troff) 
defined, GEN 5-61 
Cedilla 
See Metacharacters 
Centering 
blocks of text, GEN 5-27, 5-61 
specifying, GEN 5-24 
ch command (nroff/troff) 
defined, GEN 5-65 
Change bars (nroff/troff) 
specifying, GEN 5-72 
change command (ed) 
See c command (ed) 
change command (edit) 
See c command (edit) 
change command (ex) 
See c command (ex) 
change directory command 
See cd command (C shell) 
Changequote command (M4) 
description, PGM 2-395E 
Chapter 
formatting, GEN 5-33 
inserting in table of contents 
automatically, GEN 5-46 
specifying page numbers, GEN 
5-46 
specifying without number, GEN 
5-33 
Chapter-oriented document 
formatting, GEN 5-34F 
Character class 
circumflex within, GEN 3-42 
defined, GEN 3-41 
forming, GEN 3-33E 
lowercase letters and, GEN 3-41 
number ranges and, GEN 3-41 
special characters and, GEN 3-41 
specifying exceptions, GEN 3-42 
uppercase letters and, GEN 3-41
chase game
obsolete, SYS 1-17 
chdir command (C shell) 
See cd command (C shell) 
Cherry, L., & Morris, R. 
BC and, GEN 2-43 to 2-55
DC and, GEN 2-57 to 2-64 
Cherry, L.L., & Kernighan, B.W. 
typesetting mathematics, GEN 
5-97 to 5-104 
Typesetting Mathematics - User’s 
Guide, GEN 5-105 to 5-114 
Cherry, L.L., & Vesterman, W. 
style and diction programs, GEN 
5-163 to 5-177 
chfn 
4.2BSD improvement, SYS 1-5 
chgrp 
4.2BSD improvement, SYS 1-5 
ching game 
4.2BSD improvement, SYS 1-17 
chmod command (Bourne shell) 
making a file executable, GEN 
4-7E
marking executable files, GEN 
2-12 
chsh command (C shell) 
defined, GEN 4-64 
CHSHR file 
incoming mail and, GEN 2-17 
chshrc file 
putting into effect before next 
login, GEN 4-51 
Circle 
See Metacharacters 
Circumflex (edit) 
searching and, GEN 3-20 
Circumflex character (ed) 
at beginning of line and, GEN 
3-40 
meaning, GEN 3-33 
uses, GEN 3-40 
Circumflex character (me) 
See Metacharacters 
clear routine 
defined, PGM 4-81 
clearok routine 
defined, PGM 4-81 
Client process 
See also Server process 
description, SYS 3-19 
Clist segment 
setting number, SYS 5-122 


close function 
description, PGM 1-11 
clrtoeol routine 
defined, PGM 4-81 
cmp program 
defined, GEN 4-64 
co command (edit) 
description, GEN 3-15 
co command (ex) 
description, GEN 3-88 
Code generation (C compiler) 
description, PGM 2-68 to 2-76 
matching table entries against 
trees, PGM 2-69 
Column 
specifying, GEN 5-43 
specifying headers for continuing 
pages, GEN 5-42 
specifying headers for continuing 
pages with a macro, GEN 
5-75E 
specifying in text file, GEN 5-6 
starting, GEN 5-35 
text formatting commands for 
double columns, GEN 5-15E, 
5-35 
Comma character (ed) 
compared with semicolon, GEN 
3-45 
COMMA operator (C compiler) 
defined, PGM 2-66 
Command (Bourne shell) 
See also specific commands 
grammar, GEN 4-26 
grouping, GEN 4-14 
Command (C shell) 
See also Program 
See also specific commands 
defined, GEN 4-64 
reference list, GEN 4-63 to 4-74 
regenerating, SYS 5-118 
repeating, GEN 4-41 to 4-48, 
4-51E
substituting output for, GEN 
4-61E 
suspending temporarily, GEN 
4-36 
terminating, GEN 4-35 to 4-38 
typing, GEN 2-4 
within quotation marks, GEN 
4-60 
Command (DC) 
See also specific commands 
for human use
reference list, GEN 2-57 to 2-59 
how they work, GEN 2-57 
Command (ed) 
See also specific commands 
description, GEN 3-25 
reference list, GEN 3-34 
Command (ex) 
See also specific commands 
addressing primitives, GEN 3-87 
combining addressing primitives, 
GEN 3-87 
exceeding thresholds, GEN 3-86 
reference list, GEN 3-87 to 3-96 
structure of, GEN 3-86 
syntax, GEN 3-87E 
Command (M4) 
See also specific commands 
reference list, PGM 2-398 
Command (Mail) 
See also specific commands 
reference list, GEN 2-28 to 2-33, 
2-39T 
Command (make) 
defined, PGM 3-16 
Command (nroff) 
description, GEN 5-22 to 5-25 
Command (nroff/troff) 
See also specific commands 
reference list, GEN 5-51 
Command (vi) 
See also specific commands 
case and, GEN 3-59 
ex 3.5 changes and, GEN 3-103 
for file manipulation, GEN 3-71T 
preceding counts and, GEN 3-70 
Command file 
description, GEN 1-29 
Command line 
running two programs with one, 
GEN 2-11 
Command line flag (Mail) 
See Flag (Mail) 
Command mode (ex) 
defined, GEN 3-85 
Command name 
defined, GEN 4-64 
Command procedure 
See Shell procedure 
Command substitution 
See also Modifier (C shell) 
defined, GEN 4-65
Command-list
defined, GEN 4-8 
grouping commands, GEN 4-14 
Comment (awk) 
defined, PGM 3-9 
Comment (BC) 
convention, GEN 2-49, 2-50 
Comment (ex) 
description, GEN 3-86 
Comment (nroff/troff) 
specifying, GEN 5-67 
Communication domain 
defined, SYS 3-6 
Component 
defined, GEN 4-65 
Compound statement (BC) 
forming, GEN 2-54 
Computer-aided instruction 
See CAI scripts 
comsat program 
4.2BSD improvement, SYS 1-19 
CON operator (C compiler) 
defined, PGM 2-66 
Conditional 
See if/endif commands 
conf.c file 
4.2BSD improvement, SYS 5-14 
installing device driver and, SYS 
5-119 
conf.h file 
4.2BSD improvement, SYS 5-6 
config program 
4.2BSD improvement, SYS 1-19 
adding nonstandard system 
facilities, SYS 5-96 
defined, SYS 5-73 
description, SYS 5-73 to 5-105 
device defaults, SYS 5-99 to 5-100 
files generated by, SYS 5-76 
modifying system code, SYS 5-88 
modifying system configuration, 
SYS 5-76 
prerequisite information, SYS 
5-74 
profiled systems and, SYS 5-78 
specifying options items, SYS 
5-75 
Configuration clause 
description, SYS 5-80 
Configuration file 
contents, SYS 5-76 
creating, SYS 5-76 
grammar, SYS 5-97 to 5-98 
specifying devices, SYS 5-81
specifying multiple bootable 
images, SYS 5-80 
syntax, SYS 5-79 to 5-83 
VAX-11/780 sample, SYS 5-84 to 
5-87 
connect system call 
datagram sockets and, SYS 3-10 
errors, SYS 3-8 
establishing connection between 
sockets, SYS 1-10 
initiating connection, SYS 3-8E 
Connect time accounting 
summarizing, SYS 5-56 
Connection 
accepting, SYS 3-9E 
receiving, SYS 3-8 to 3-9 
Constant (BC) 
defined, GEN 2-50 
Context search (ed) 
backslash character and, GEN 
3-43 
defined, GEN 3-35 
methods, GEN 3-30 to 3-31 
question mark character and, 
GEN 3-48 
repeating a search, GEN 3-31 
reverse order and, GEN 3-31 
slashes and, GEN 3-39 
Context search (edit) 
d command and, GEN 3-16 
delete command and, GEN 3-16E
move command and, GEN 3-15 
repeating, GEN 3-20E 
reversing, GEN 3-20 
s command and, GEN 3-20 
continue command (C shell) 
defined, GEN 4-65 
continue statement (awk) 
defined, PGM 3-9 
Control character (C shell) 
defined, GEN 4-65 
Control character (nroff/troff) 
changing, GEN 5-67 
commands and, GEN 5-56 
Control character (vi) 
in text file, GEN 3-61 
Control statement (BC), GEN 
2-47E
description, GEN 2-47 to 2-48 
Cooper, E., & others 
4.2BSD System Manual, PGM 
4-15 to 4-52 


copy command (C shell) 
See cp command (C shell) 
copy command (edit) 
See co command (edit) 
copy command (ex) 
See co command (ex) 
copy command (Mail) 
See also save command (Mail) 
description, GEN 2-29 
using, GEN 2-23E 
copy program 
loading, SYS 5-24E 
mini-root file system and, SYS 
5-24 
Core dump file 
defined, GEN 4-65 
program faults and, GEN 1-31 
terminating a program and, GEN 
4-37 
Cover sheet 
entering in text file, GEN 5-5 
formatting commands, GEN 5-5E 
cp command (C shell) 
4.2BSD improvement, SYS 1-5 
copying a file, GEN 2-7E, 3-47
defined, GEN 4-65 
saving a file, GEN 3-47E 
cpu type parameter (config) 
defined, SYS 5-79 
CR key 
See RETURN key 
Crash 
recovering files after, GEN 3-22 
creat function 
description, PGM 1-10 
creat system call 
obsolete in 4.2BSD, SYS 1-10 
cref program 
defined, GEN 2-13 
crmode routine 
defined, PGM 4-84 
crt option (Mail) 
paging mail, GEN 2-20 
type command and, GEN 2-32 
crt0.ex file 
4.2BSD improvement, SYS 5-13 
cs command (troff) 
defined, GEN 5-58 
csh program 
See C shell 
cshrc file 
defined, GEN 4-65 
logging in and, GEN 4-39 


CSPACE operator (C compiler)
defined, PGM 2-64
css network driver
4.2BSD improvement, SYS 1-15
ctags
4.2BSD improvement, SYS 1-5
ctime library
4.2BSD improvement, SYS 1-14
CTRL-B
defined, GEN 3-75
description, GEN 3-56
CTRL-C
ULTRIX-32 and, GEN 2-1
CTRL-D
See also CTRL-U
defined, GEN 3-75
description, GEN 3-56
CTRL-E
defined, GEN 3-75
description, GEN 3-56
CTRL-F
defined, GEN 3-75
description, GEN 3-56
CTRL-G
defined, GEN 3-75
vi and, GEN 3-57
CTRL-H
See also At sign
See also u command (edit)
defined, GEN 3-75
deleting characters, GEN 3-7
CTRL-J
defined, GEN 3-75
CTRL-L
defined, GEN 3-75
CTRL-M
defined, GEN 3-75
CTRL-N
defined, GEN 3-75
CTRL-P
defined, GEN 3-76
CTRL-R
defined, GEN 3-76
CTRL-U
See also CTRL-D
defined, GEN 3-76
description, GEN 3-56
ULTRIX-32 and, GEN 2-1
CTRL-Y
defined, GEN 3-76
description, GEN 3-56
CTRL-Z
defined, GEN 3-76
cu command (nroff)
defined, GEN 5-67
cu program
See tip program
Current line
printing, GEN 3-11E
curses library
4.2BSD improvement, SYS 1-14
Cursor motion optimization
stand alone, PGM 4-78 to 4-80
Cursor positioning key
terminals and, GEN 3-55
Cut mark
specifying for troff, GEN 5-74E
Cutting and pasting
See cp command (ed)
See m command (ed)
See mv program (ed)
with ed, GEN 3-49 to 3-51
with UNIX commands, GEN 3-47
to 3-49
cwd variable (C shell)
defined, GEN 4-65
working directory and, GEN 4-41
Cylinder group
description, SYS 1-26, 2-8
Czech
See Metacharacters


D 


d command (DC)
description, GEN 2-58
d command (ed)
defined, GEN 3-34
using, GEN 3-29
d command (edit)
context search and, GEN 3-16
description, GEN 3-15
d command (ex)
description, GEN 3-88
d command (me)
defined, GEN 5-43
d command (sed)
defined, GEN 3-108
D command (vi)
defined, GEN 3-78
d escape (Mail)
description, GEN 2-24
d flag (Mail)
See also debug option
debugging information and, GEN
2-36
d flag (make)
defined, PGM 3-17 
d operator (vi) 
defined, GEN 3-80 
d option (inv) 
defined, GEN 5-147 
d option (uucico) 
defined, SYS 5-135 
d option (uuclean) 
defined, SYS 5-137 
d option (uucp) 
defined, SYS 5-131 
DA command (ms) 
specifying date on text page, GEN 
5-9 
da command (nroff/troff) 
defined, GEN 5-65 
Daisy wheel printer 
setting for 12-pitch, GEN 5-39 
DARPA File Transfer Protocol 
server program 
See ftpd program 
DARPA Internet 
network architecture support, SYS 
1-15 
DARPA Internet protocol 
support, SYS 5-47 
DARPA Request For Comments 
#833 
sendmail and, SYS 1-4 
DARPA Simple Mail Transfer 
Protocol 
sendmail and, SYS 1-4 
DARPA TELNET protocol 
See telnetd server program 
DARPA Trivial File Transfer 
Protocol 
See tftpd server program 
Dash 
specifying em dash, GEN 5-47 
Data block 
kinds of, SYS 2-12 
Data file 
defined, SYS 5-131 
DATA operator (C compiler) 
defined, PGM 2-64 
Data segment (as) 
description, GEN 6-54 
data statement 
defined, GEN 6-59 
Data Translation A/D converter 
See ad driver 
Datagram socket 
See also Raw socket
creating for on-machine use, SYS 
3-7E 
defined, SYS 3-6 
description, SYS 3-10 
sending broadcast packets on 
networks, SYS 3-27 
Date 
specifying with -me, GEN 5-47 
specifying with -ms, GEN 5-9 
date command (C shell) 
defined, GEN 4-65 
using, GEN 2-4 
dbx symbolic debugger 
description, SYS 1-4 
Pascal compiler pc and, SYS 1-8
DC program 
See also BC language 
defined, GEN 2-57 
description, GEN 2-57 to 2-64 
internal arithmetic and, GEN 
2-60 
programming, GEN 2-62 
de command (nroff/troff) 
See also ig command (nroff/troff) 
defined, GEN 5-64 
defining macros, GEN 5-89E 
Dead.letter file, GEN 2-24 
canceling mail and, GEN 2-18 
debug option (Mail) 
See also -d flag 
defined, GEN 2-34 
Debugging 
defined, GEN 4-65 
DecWriter III printer 
setting for serial lines, PGM 
4-101E 
Default 
defined, GEN 4-65 
define command (M4) 
description, PGM 2-393 to 2-395 
define keyword (BC), GEN 2-46E 
define program (EQN) 
description, GEN 5-100 
define statement (BC) 
forming, GEN 2-55 
delay routine 
description, PGM 2-76 
Delayed text 
defined, GEN 5-28 
delch routine 
defined, PGM 4-82 
delete command (ed) 
See d command (ed) 


delete command (edit) 
See d command (edit) 
delete command (ex) 
See d command (ex) 
delete command (Mail) 
See also autoprint option (Mail) 
See also dt command (Mail) 
See also undelete command 
(Mail) 
abbreviating, GEN 2-20 
description, GEN 2-29 
keeping message from mbox, GEN 
2-20E 
DELETE key 
defined, GEN 4-65 
description, GEN 3-55 
ULTRIX-32 and, GEN 2-1 
deleteln routine 
defined, PGM 4-82 
delivermail program 
See sendmail program 
delwin routine 
defined, PGM 4-85 
DES encryption algorithm 
chips and, SYS 4-11 
Description file (make), PGM 3-14E 
See also -f flag (make) 
description, PGM 3-15 to 3-16 
Detached command 
defined, GEN 4-65 
Device driver 
converting local to 4.2BSD, SYS 
5-4 
CSR value list, SYS 5-61 
I/O system and, PGM 4-67 to 
4-73 
installing new, SYS 5-119 
prerequisites, SYS 5-89
Device name 
convention, SYS 5-19 
devices.vax file 
4.2BSD improvement, SYS 5-11 
df 
reporting disk space in kilobytes, 
SYS 1-5 
dh.c device driver 
4.2BSD improvement, SYS 5-12 
di command (nroff/troff) 
defined, GEN 5-64 
diverting output to a macro, GEN 
5-94 
Diacritical marks 
available 
reference list, GEN 5-19
entering with EQN, GEN 5-100 
Diagnostic 
defined, GEN 4-65 
Diagnostic output 
redirecting, GEN 4-44E 
Dial-up network 
description, SYS 5-123 to 5-129 
operation, SYS 5-124 
processing, SYS 5-125 to 5-126 
protocol and, SYS 5-124, 5-126 
security, SYS 5-125 
starting your network, SYS 5-128 
transmission speed, SYS 5-127 
uses, SYS 5-126 
Diction program 
See also Style program 
description, GEN 5-163 to 5-177 
diff utility 
comparing files, GEN 2-13 
dir 
4.2BSD improvement, SYS 1-16 
dir.h file 
4.2BSD improvement, SYS 5-6 
directories command 
See dirs command (C shell) 
Directory 
See also Home directory 
See also Root directory 
See also Working directory 
allocating, SYS 1-33 
alternate name for, GEN 2-10 
changing, GEN 2-10 
changing working directory, GEN 
2-10 
creating, GEN 2-10 
defined, GEN 4-66, PGM 4-10 
description, GEN 1-21, 2-9 
determining, GEN 2-10 
listing basic, GEN 2-9 
moving up one level, GEN 2-10E 
organization changes for 4.2BSD, 
SYS 5-4 
project-related, GEN 4-48 
removing, GEN 2-10E 
security of, SYS 4-4 
Directory data block 
defined, SYS 2-12 
directory library 
4.2BSD improvement, SYS 1-14 
directory option (ex) 
description, GEN 3-98 
Directory stack 
defined, GEN 4-66
dirs command (C shell)
See also pwd command (C shell) 
compared with pwd, GEN 4-49 
defined, GEN 4-66 
saving name of previous directory, 
GEN 4-49 
Disk 
balancing load, SYS 5-39 
configuring load, SYS 5-37 to 5-43 
defined, GEN 3-4 
dividing into partitions, SYS 5-38 
formatting, SYS 5-22 to 5-24 
reporting space in kilobytes, SYS 
1-5 
reporting usage in kilobytes, SYS 
1-5 
space limits, SYS 4-3 
space per device, SYS 5-38, 5-39T 
Disk bandwidth
4.2BSD improvement, SYS 1-3 
Disk driver 
UNIX implementation and, PGM 
4-9 
Disk partition 
description, SYS 5-19 
sizes, SYS 5-38 
Disk quota 
4.2BSD improvement, SYS 1-18 
disabling, SYS 2-4 
enabling, SYS 2-4 
enforcing, SYS 5-57 
per filesystem, SYS 1-4 
per user, SYS 1-4 
recovering from over quota 
condition, SYS 2-3 
restricting, SYS 1-35 
setting, SYS 2-4 
types of, SYS 2-3 
Disk quota system 
configuration requirement, SYS 
5-57 
description, SYS 2-3 to 2-5 
establishing, SYS 2-4 
history, SYS 2-5 
including, SYS 2-4E 
programs, SYS 5-57 
diskpart program 
4.2BSD improvement, SYS 1-19 
disktab file 
4.2BSD improvement, SYS 1-16 
Display (nroff) 
defined, GEN 5-25, 5-42 
description, GEN 5-25 to 5-27 
specifying in fill mode, GEN 5-26
text formatting commands for, 
GEN 5-15E 
distrib routine 
description, PGM 2-68 
Distribution tape 
constructing, SYS 5-59 to 5-61 
contents, SYS 5-59T 
Diversion (troff) 
description, GEN 5-94 
divert command (M4) 
description, PGM 2-396 
Division 
DC and, GEN 2-61 
divnum command (M4) 
description, PGM 2-396 
DL-11W 
See kg driver 
dmc network interface driver 
4.2BSD improvement, SYS 1-15 
DMC-11/DMR-11 point-to-point 
communications device 
See dmc network interface driver 
dmf.c device driver 
4.2BSD improvement, SYS 5-12 
dnl command (M4) 
description, PGM 2-397 
Document preparation 
description, GEN 2-12 to 2-14 
hints, GEN 2-13 to 2-14 
reading list, GEN 2-16 
DOD Standard TCP/IP network 
communication protocols 
support for, SYS 1-3 
Dollar sign character (ed) 
end of line and, GEN 3-39 
meaning, GEN 3-33, 3-40 
p command and, GEN 3-28 
printing value, GEN 3-35
Dollar sign character (edit) 
equal sign and, GEN 3-17 
printing last buffer line, GEN 
3-17 
searching and, GEN 3-20 
domain.h file 
4.2BSD improvement, SYS 5-5 
don’t command (sed) 
defined, GEN 3-113 
Dot character (C shell) 
at beginning of file, GEN 4-34 
defined, GEN 4-63 
separating filename components, 
GEN 4-33 


Dot character (ed) 
determining value, GEN 3-29E 
equal sign and, GEN 3-35 
line number defaults and, GEN 
3-44 to 3-45 
meaning, GEN 3-38, 3-39 
meaning for context searching, 
GEN 3-33 
p command and, GEN 3-28 
printing, GEN 3-39 
s command and, GEN 3-29 
setting with semicolon, GEN 3-45 
to 3-46 
using, GEN 3-28, 3-33 
Dot character (edit) 
equal sign and, GEN 3-17 
uses, GEN 3-17 
Dot character (nroff/troff) 
See Control character (nroff/troff) 
specifying lines of, GEN 5-88 
dot option (Mail) 
See also ignoreeof option
defined, GEN 2-34 
Doublespacing 
specifying, GEN 5-23 
drtest program 
4.2BSD improvement, SYS 1-19 
DS command (ms) 
specifying line breaks, GEN 5-8 
ds command (nroff/troff) 
defined, GEN 5-64 
defining strings, GEN 5-89 
DSTFLAG parameter 
description, SYS 5-122 
dt command (Mail) 
description, GEN 2-29 
dt command (nroff/troff) 
defined, GEN 5-65 
du command (C shell) 
defined, GEN 4-66 
reporting disk usage in kilobytes, 
SYS 1-5 
du program 
See du command (C shell) 
dump program
See also rdump program 
4.2BSD improvement, SYS 1-16, 
1-19 
using, SYS 5-53 
dumpdef command (M4) 
description, PGM 2-397 
dumpfs program 
4.2BSD improvement, SYS 1-19 
Dungeons of doom
See Rogue game 

Dynamic string storage allocator 
See Allocator 


E 


e command (ed) 
defined, GEN 3-34 
using, GEN 3-27, 3-49E 
e command (edit) 
copying a file, GEN 3-14 
r option and, GEN 3-23 
u command and, GEN 3-16 
e command (ex) 
description, GEN 3-88 
E command (vi) 
defined, GEN 3-79 
e command (vi) 
defined, GEN 3-80 
e escape (Mail) 
description, GEN 2-24 
e flag (sed) 
defined, GEN 3-106 
e modifier (C shell) 
extracting filename extension, 
GEN 4-57E 
e option (nroff) 
defined, GEN 5-50 
ec command (nroff/troff) 
defined, GEN 5-66 
ec network interface driver 
4.2BSD improvement, SYS 1-15 
echo command (C shell) 
defined, GEN 4-66 
echo routine 
defined, PGM 4-84 
ed line editor 
See also edit line editor 
See also ex line editor 
accessing, GEN 3-25 
adding text, GEN 3-25 
addressing lines, GEN 3-43 to
3-46
advanced editing, GEN 3-37 to
3-52
backslash character and, GEN
3-33
breaking lines, GEN 3-42
CAI script for, GEN 6-7
changing text, GEN 3-31 to 3-32
command summary, GEN 3-34
context searching, GEN 3-30 to
3-31
copying lines, GEN 3-51 
creating text, GEN 3-25 
deleting text, GEN 3-29 
description, GEN 2-6 
escaping to use UNIX command, 
GEN 3-51 
global commands, GEN 3-32 
inserting text, GEN 3-31 to 3-32 
interrupting, GEN 3-46 
introduction, GEN 3-25 to 3-35 
joining lines, GEN 3-42 
line number defaults, GEN 3-44 
to 3-45 
marking a line, GEN 3-50 
moving text, GEN 3-32, 3-50 
printing a file, GEN 2-7 
printing lines, GEN 3-27 
reading a file, GEN 3-27 
rearranging a line, GEN 3-43 
repeating searches, GEN 3-44 
searching for first occurrence of 
text string, GEN 3-46 
sed and, GEN 3-105 
setting dot, GEN 3-45 to 3-46 
specifying lines with text patterns, 
GEN 3-46 to 3-47 
specifying the second occurrence 
of text string, GEN 3-46 
substituting text, GEN 3-29 
supporting tools, GEN 3-51 to 
3-52 
using special characters, GEN 
3-33 
writing a file, GEN 3-26 
ed.hup file 
saving text, GEN 2-6 
edcompatible option (ex) 
description, GEN 3-98 
edit command (ed) 
See e command (ed) 
edit command (edit) 
See e command (edit)
edit command (ex) 
See e command (ex) 
edit command (Mail) 
See also visual command (Mail) 
description, GEN 2-29 
edit line editor 
See also ed line editor 
See also ex line editor 
accessing, GEN 3-5 to 3-6 
adding text, GEN 3-9 
correcting text, GEN 3-9
current line and, GEN 3-11
defined, GEN 3-3
entering text, GEN 3-6
ex editor and, GEN 3-23
finding a line, GEN 3-11E
issuing UNIX command from,
GEN 3-21
messages, GEN 3-6
moving around in the buffer, GEN
3-17
opening a file, GEN 3-9E, 3-14E
prerequisites, GEN 3-3
printing current line number,
GEN 3-11
printing nonprinting characters,
GEN 3-10
quitting, GEN 3-8
reversing last command, GEN
3-16
saving modified text, GEN 3-13
searching for characters, GEN
3-10, 3-10E
tutorial, GEN 3-3 to 3-23 
Editing 
hints for, GEN 2-13 
Editor
See ed editor
See edit editor 
See ex editor 
See Screen editor 
See sed stream editor 
See vi screen editor 
EDITOR option (Mail) 
defined, GEN 2-33 
setting, GEN 2-33 
specifying an editor, GEN 2-24 
edquota program 
4.2BSD improvement, SYS 1-19 
ef command (me) 
defined, GEN 5-41 
efftab table 
defined, PGM 2-68 
EFL programming language 
description, PGM 2-123 to 2-157 
eh command (me) 
defined, GEN 5-41 
el command (nroff/troff) 
defined, GEN 5-71 
else command (C shell) 
See also if/endif commands (C 
shell) 
See also then command (C shell) 
defined, GEN 4-66 


else command (Mail) 
See also if/endif commands (Mail) 
description, GEN 2-30 
else statement (awk) 
defined, PGM 3-9 
Elz, R. 
disk quota system, SYS 2-3 to 2-5 
em 
defined, GEN 5-86 
em command (nroff/troff) 
defined, GEN 5-65 
Em dash 
in nroff/troff output, GEN 5-19 
Emphasis 
See Boldface 
See Italic 
See Overstriking 
See Underlining 
en network interface driver 
4.2BSD improvement, SYS 1-16
enable/disable command (lpc)
description, PGM 4-103 
endif command (C shell) 
See if/endif commands (C shell) 
endif command (Mail) 
See if/endif commands (Mail) 
endif statement (as) 
See if/endif statement (as) 
endwin routine 
defined, PGM 4-85 
Entry file 
defined, GEN 5-145 
Environment (C shell) 
displaying, GEN 4-51E 
Environment (nroff/troff) 
description, GEN 5-71, 5-94 
eo command (nroff/troff) 
defined, GEN 5-66 
EOF (End of File) 
defined, GEN 2-5, 4-66 
EOF operator (C compiler) 
defined, PGM 2-64 
EOF value 
defined, PGM 1-21 
description, PGM 1-4 
ep command (me) 
defined, GEN 5-42 
EQ command (EQN) 
specifying continuation, GEN 5-35 
specifying equations, GEN 5-34 
supplementing with troff 
commands, GEN 5-101 
EQ command (me) 
defined, GEN 5-45 
EQ command (ms)
specifying equations, GEN 5-10 
EQN program 
See also NEQN program 
CAI script for, GEN 6-7 
connecting output to troff, GEN 
5-101 
deficiencies, GEN 5-102 
defined, GEN 5-105 
description, GEN 5-33, 5-97 to
5-104
forcing extra white space, GEN
5-99
formatting mathematics, GEN
2-13
grammar, GEN 5-101
language design, GEN 5-98 
language theory, GEN 5-101 
quoting an input string, GEN 
5-100 
Equal sign (ed) 
dot character and, GEN 3-35 
Equation 
continuing, GEN 5-35E 
formatting, GEN 5-33 
numbering, GEN 5-34 
setting with -ms, GEN 5-10 
text formatting commands for, 
GEN 5-16E 
Erase character 
See also Backspace character 
default, GEN 4-30 
erase routine 
defined, PGM 4-82 
errno cell 
description, PGM 1-12 
errno.h file 
4.2BSD improvement, SYS 5-5 
error 
troff messages and, SYS 1-5 
errorbells option (ex)
description, GEN 3-98 
Error condition (fsck) 
conventions, SYS 2-14 
Error log file 
examining, SYS 5-53 
Error message (ed) 
description, GEN 3-26 
errprint command (M4) 
description, PGM 2-397 
Escape character (Mail) 
changing, GEN 2-26 
Escape character (nroff/troff) 
description, GEN 5-66 
Escape character (C shell)
defined, GEN 4-66 
escape command 
See ! command (ed) 
ESCAPE key 
description, GEN 3-55 
escape option (Mail) 
changing escape character, GEN 
2-26 
defined, GEN 2-34 
Escape sequence (nroff/troff) 
reference list, GEN 5-54 
ev command (nroff/troff) 
changing environment, GEN 5-94 
description, GEN 5-72 
eval command (M4) 
description, PGM 2-396 
Evans and Sutherland Picture 
System 2 
See ps.c device driver 
EVEN operator (C compiler) 
defined, PGM 2-64 
even statement (as) 
defined, GEN 6-59 
ex command (ex) 
See e command (ex) 
ex command (nroff/troff) 
defined, GEN 5-72 
ex line editor 
See also ed line editor 
See also edit line editor 
See also sed stream editor 
See also vi screen editor 
3.5 changes, GEN 3-102 
command line format, GEN 3-83 
editing modes, GEN 3-85 
encryption code and, GEN 3-102 
entering multiple commands on a 
line, GEN 3-86 
errors and, GEN 3-85 
file manipulation, GEN 3-84 to 
3-85 
limitations, GEN 3-101 
printing current line number, 
GEN 3-95 
printing version number, GEN 
3-94 
recovering from crash, GEN 3-85 
recovering work, GEN 3-85E 
reference manual, GEN 3-83 to 
3-104 
starting, GEN 3-83 
vi and, GEN 3-73 


Ex Reference Manual, GEN 3-83 to 
3-104 
See also ex line editor 
Examples 
entering with troff, GEN 5-89 
Exception word list (nroff/troff) 
specifying, GEN 5-69 
Exclamation mark (C shell) 
using in command arguments, 
GEN 4-35 
Exclamation mark character (ed) 
shell command and, GEN 3-35 
Exclamation mark character (edit) 
shell command and, GEN 3-21 
Exclusive lock 
process and, SYS 1-3 
execl function 
See also execv 
See also fork function 
description, PGM 1-13 
Execute file 
defined, SYS 5-133 to 5-134 
execv routine
description, PGM 1-13 
exit command (C shell) 
defined, GEN 4-66 
exit command (Mail) 
description, GEN 2-30 
exit function 
error handling and, PGM 1-8 
exit statement (awk) 
defined, PGM 3-9 
exit status 
defined, GEN 4-66 
exp function (awk) 
defined, PGM 3-8 
Expansion 
defined, GEN 4-67 
Exponentiation 
DC and, GEN 2-61 
Exponentiation operator 
description, GEN 2-52 
EXPR operator (C compiler) 
defined, PGM 2-65 
Expression 
defined, GEN 4-67 
Expression (as) 
defined, GEN 6-56 
types of 
reference list, GEN 6-57 
Expression (BC) 
See also Primitive expression 
defined, GEN 2-50 to 2-53 
length, GEN 2-51 


Expression (C shell) 
evaluating, GEN 4-55 
Expression operator (as) 
reference list, GEN 6-57 
Expression statement (as) 
defined, GEN 6-55 
Expression statement (BC) 
description, GEN 2-54 
Extended Fortran Language 
See EFL programming language 
Extension 
defined, GEN 4-67 
External security code 
password security and, SYS 4-12 
eyacc 
4.2BSD improvement, SYS 1-5


F 


F argument (nroff) 
specifying fill mode, GEN 5-26 
f command (ed) 
defined, GEN 3-34 
determining the filename, GEN 
3-49 
renaming a file, GEN 3-49E 
f command (edit) 
description, GEN 3-21 
f command (ex) 
description, GEN 3-89 
f command (me) 
defined, GEN 5-43 
entering, GEN 5-28 
f command (troff) 
mixing fonts within a line, GEN 
5-86 
mixing fonts within a word, GEN 
5-86 
F command (vi) 
defined, GEN 3-79 
using, GEN 3-61 
f command (vi) 
defined, GEN 3-80 
using, GEN 3-61 
f flag (Mail) 
defined, GEN 2-36 
reading mail from specified file, 
GEN 2-21 
f flag (make) 
defined, PGM 3-17 
f flag (mkey) 
reading file list, GEN 5-147 
f flag (sed) 
defined, GEN 3-106 
f flag (su)
fast su and, SYS 1-9 
f macro (me) 
defined, GEN 5-42 
F option (hunt) 
defined, GEN 5-148 
f option (troff) 
defined, GEN 5-50 
f77 I/O library 
4.2BSD improvement, SYS 1-6 
description, PGM 2-79 to 2-88 
error messages, PGM 2-85 to 2-87 
exceptions to ANSI standard, 
PGM 2-88 
Fabry, R., & others 
4.2BSD System Manual, PGM 
4-15 to 4-52 
Fabry, R.S., & others 
4.2BSD Interprocess 
Communication Primer, SYS 
3-5 to 3-28 
fast file system, SYS 1-23 to 1-38 
networking implementation notes, 
SYS 3-29 to 3-57 
factor program 
4.2BSD improvement, SYS 1-17 
fastboot script 
See also fasthalt script 
4.2BSD improvement, SYS 1-19C 
fasthalt script 
See also fastboot script 
4.2BSD improvement, SYS 1-19 
fc command (nroff/troff)
defined, GEN 5-66 
fchmod system call 
4.2BSD improvement, SYS 1-10
fchown system call 
4.2BSD improvement, SYS 1-10 
fclose function 
description, PGM 1-7 
fcntl system call
4.2BSD improvement, SYS 1-10 
FCON operator (C compiler) 
defined, PGM 2-66 
fed font editor 
value of, SYS 1-6 
Feldman, S.I. 
EFL programming language, PGM 
2-123 to 2-157 
Make program, PGM 3-18 to 3-21 
Feldman, S.I., & Weinberger, P.J. 
Fortran 77 compiler, PGM 2-89 to 
2-109 
feof macro
breakpoints and, PGM 1-21 
ferror macro 
breakpoints and, PGM 1-21 
fflush function 
description, PGM 1-8 
fg command (C shell) 
defined, GEN 4-67 
running background job in 
foreground, GEN 4-47E 
running suspended job in 
foreground, GEN 4-47 
fgets function 
description, PGM 1-8 
fgrep 
hunt program and, GEN 5-148 
fi command (nroff/troff) 
defined, GEN 5-61 
Field (awk) 
description, PGM 3-8 
Field (nroff/troff) 
defined, GEN 5-66 
Figure 
specifying blank page for, GEN 
5-44 
specifying ruling for, GEN 5-45 
specifying space for, GEN 5-44 
FILE 
defined, PGM 1-21 
File 
See also File system 
See also specific files 
advisory locking and, SYS 1-3 
appending, GEN 3-48 
appending contents to mail, GEN 
2-24 
arranging, GEN 2-10 
CAI script for, GEN 6-7 
combining, GEN 2-10, 3-48, 3-49 
comparing, GEN 2-13 
copying, GEN 2-7H, 3-47 
copying from other directories, 
GEN 2-9 
creating, GEN 2-6 
defined, GEN 2-6, 3-3, PGM 4-10 
description, GEN 1-20 
displaying, GEN 2-10 
handling multiple, GEN 2-8 
I/O device and, GEN 1-21 
marking executable, GEN 2-12 
merging multiple, GEN 2-14 
open limit, PGM 1-11 
opening with edit, GEN 3-14 
optimal size, SYS 1-28
paging, GEN 2-7 
printing, GEN 2-7 
printing from other directories, 
GEN 2-9 
printing merged, GEN 2-11 
printing multiple, GEN 2-7, 2-8, 
2-11 
printing on high-speed printer, 
GEN 2-7 
programs executed by the shell 
and, GEN 1-27 
protection information, SYS 4-3 
recovering with edit, GEN 3-22 
removing, GEN 3-48 
removing multiple from directory, 
GEN 2-10E 
renaming, GEN 2-7 
replacing the terminal, GEN 2-10 
sending to several people, GEN 
2-11 
size of, GEN 1-23, 2-13
splitting, GEN 2-13 
truncating to specific length, SYS 
1-4 
viewing in other directories, GEN 
2-9 
writing part of, GEN 3-49 
writing to disk, GEN 3-8 
File (C shell) 
See also specific files 
accessing from other directories, 
GEN 4-34 
directing input from, GEN 4-32E 
to 4-33E 
inputting to, GEN 4-31 
maintaining related, GEN 4-53 
outputting from, GEN 4-31 
redirecting terminal output to, 
GEN 4-31E 
terminating a command, GEN 
4-36E 
File (line printer system) 
reference list, PGM 4-99 
File (M4) 
manipulating, PGM 2-396 
File (vi) 
quitting, GEN 3-63 
recovering, GEN 3-66 
writing, GEN 3-63 
file command 
symbolic links and, SYS 1-6 
file command (edit) 
See f command (edit) 


file command (ex) 
See f command (ex) 
file command (Mail) 
See folder command (Mail) 
File descriptor 
changing assignments, GEN 1-28 
description, PGM 1-8 
File locking 
description, SYS 1-33 
File pointer 
defined, PGM 1-5 
File system 
accessing directories on old and 
new systems, SYS 1-33 
block size, SYS 2-8 
checking structural integrity, SYS 
2-10 
data structure, PGM 4-12F 
defined, PGM 4-10 to 4-13 
description, GEN 1-20 to 1-24 
fixing corrupted, SYS 2-10 to 2-13 
fragmentation of, SYS 2-9 
implementation, PGM 4-11 
implementing, GEN 1-24 to 1-26 
overview, SYS 2-8 to 2-9 
protecting, GEN 1-22 
removable volume and, GEN 1-22 
updating, SYS 2-9 
File system (4.2BSD) 
See also File system (Bell) 
allocating data blocks, SYS 1-30 
allocating directories, SYS 1-30 
allocating new blocks, SYS 1-29 
allocation strategy, SYS 1-30 
block size, SYS 1-26 
block size and wasted space, SYS 
1-27T 
compared to previous file system, 
SYS 1-23 to 1-38 
creating file versions, SYS 1-35 
fragments and, SYS 1-27 
free blocks and, SYS 1-28 
hardware parameters and, SYS 
1-28 to 1-29 
implementing layout, SYS 5-42 
layout policies, SYS 1-29 to 1-30 
locking files, SYS 1-33 
moving, SYS 5-54 
optimizing storage, SYS 1-26 
organization, SYS 1-26 to 1-30 
performance, SYS 1-31 to 1-32
quotas and, SYS 2-4 
reading rates, SYS 1-31T 
restricting quota, SYS 1-35
selecting parameters, SYS 5-40 to 
5-41 
software engineering, SYS 1-36 
space overhead, SYS 1-28 
writing rates, SYS 1-31T 
File system (Bell) 
description, SYS 1-25 
File System Check Program 
See fsck program 
file.h file 
4.2BSD improvement, SYS 5-6 
Filelist file 
creating, GEN 2-10 
Filename 
4.2BSD changes, SYS 5-4 
arbitrary length and, SYS 1-3 
changing, GEN 3-47, 3-47W 
restriction, GEN 3-47 
conventions for, GEN 2-8 
description, GEN 1-21 
edit editor and, GEN 3-21 
folder name and, GEN 2-23 
maximum length, SYS 1-33 
renaming in same file system, 
SYS 1-4 
specifying, GEN 3-8 
suggestions, GEN 2-7 
Filename (C shell) 
base part and, GEN 4-63 
characters in, GEN 4-33 
defined, GEN 4-67 
Filename expansion 
defined, GEN 4-67 
FILENAME variable (awk) 
determining current input file, 
PGM 3-6 
files file 
4.2BSD improvement, SYS 5-11 
adding device driver and, SYS 
5-89 
files.vax file 
4.2BSD improvement, SYS 5-11 
Fill mode 
specifying, GEN 5-26 
Filling (nroff/troff) 
description, GEN 5-60 to 5-61 
filsys.h file 
See fs.h file 
Filter 
calling, PGM 4-103E 
creating for printers, PGM 4-102 
defined, GEN 4-4 
description, GEN 1-28 
find
finding symbolic links, SYS 1-6 
Find key 
defined, GEN 5-144 
First page 
entering in text file, GEN 5-5 
fl command (nroff/troff) 
defined, GEN 5-73 
Flag (C shell) 
purpose of, GEN 4-31 
Flag (ex) 
description, GEN 3-86 
Flag (Mail) 
reference list, GEN 2-41T 
Flag option (C shell) 
defined, GEN 4-67 
Flag option (Mail) 
defined, GEN 2-38 
flags field (config)
description, SYS 5-82 
Floating keep, GEN 5-26F 
defined, GEN 5-26 
flock system call 
4.2BSD improvement, SYS 1-10 
fmt command 
formatting outgoing mail, GEN 
2-26 
fo command (me) 
defined, GEN 5-41 
entering, GEN 5-23 
Foderaro, J.K., & others 
Franz Lisp Manual, The, PGM 
2-211 to 2-358 
Folder 
specifying for file, GEN 2-23 
folder command (Mail) 
See also folders command (Mail) 
description, GEN 2-30 
directing Mail to a folder, GEN 
2-23 
Folder directory 
specifying, GEN 2-23 
Folder facility 
description, GEN 2-23 
folder option (Mail) 
defined, GEN 2-34 
Folders 
maintaining, GEN 2-23 
folders command (Mail) 
See also folder command (Mail) 
description, GEN 2-30 
listing folder set, GEN 2-23 
Font 
changing, GEN 5-58, 5-86
command list, GEN 5-51 
default, GEN 5-58 
defined, GEN 5-36 
description, GEN 5-36 to 5-37 
mixing within a line, GEN 5-86 
mixing within a word, GEN 5-37, 
5-86 
setting, GEN 5-39 
specifying, GEN 5-44, 5-85 
specifying for a word, GEN 5-36E 
specifying for more than one word, 
GEN 5-36 
style examples, GEN 5-78T 
switching, GEN 5-36 
Font library 
installing, SYS 5-31 
Footer 
See also Header 
formatting, GEN 5-41 to 5-42 
specifying, GEN 5-23 
Footnote 
See also Delayed text 
entering, GEN 5-8, 5-28, 5-43 
entering with a macro, GEN 
5-76E 
numbered automatically, GEN 
5-17 
resetting the numbering, GEN 
5-46 
separating footnotes, GEN 5-43 
specifying point size, GEN 5-8 
text formatting commands for, 
GEN 5-15E 
fopen function 
See also fclose function 
See also open function 
calling, PGM 1-5E 
description, PGM 1-5 
for loop 
description, GEN 4-7 
form, GEN 4-8E 
for statement (awk) 
defined, PGM 3-9 
for statement (BC) 
forming, GEN 2-54 
process, GEN 2-47 
writing, GEN 2-47 
Fork system call
description, GEN 1-26 
foreach command (C shell), GEN 
4-56E 
defined, GEN 4-67 
exiting loop, GEN 4-58
performing similar commands, 
GEN 4-60E 
Foreground 
defined, GEN 4-67 
Foreground job 
continuing, GEN 4-46 
description, GEN 4-45 to 4-48 
suspending, GEN 4-46 
fork function 
description, PGM 1-14 
Form feed character 
printing, GEN 3-37 
Form letter 
using with nroff/troff, GEN 5-72 
format program 
4.2BSD improvement, SYS 1-18, 
1-19, 5-15 
formatting disks, SYS 5-22 to 
5-24 
loading, SYS 5-23 
Fortran 
See f77 I/O library 
See Fortran 77 
See Ratfor language 
Fortran 77 
C and, GEN 2-15 
running old programs, PGM 2-83 
Fortran 77 compiler 
4.2BSD improvement, SYS 1-4 
description, PGM 2-89 to 2-109 
Fortran I/O 
See also f77 I/O library 
constraints, PGM 2-80 to 2-82 
execution, PGM 2-80 
forms of, PGM 2-79 to 2-80 
general concepts, PGM 2-79 to 
2-80 
logical units and, PGM 2-80 
unit numbers and, PGM 2-80 
fortune game 
4.2BSD improvement, SYS 1-17 
Forward slash 
searching for, GEN 3-39 
fp command 
specifying fonts on the typesetter, 
GEN 5-86 
fp compiler/interpreter 
Functional Programming language 
and, SYS 1-6 
FP programming language 
description, PGM 2-359 to 2-391 
fpr program 
printing Fortran files, SYS 1-6 
fprintf function
description, PGM 1-7 
Fraction 
setting with troff, GEN 5-86E 
specifying with EQN, GEN 5-99 
Fragment size 
selecting, SYS 5-41 
frame.h file 
4.2BSD improvement, SYS 5-13 
Franz Lisp Manual, The, PGM 
2-211 to 2-358 
See also Franz Lisp system 
Franz Lisp system 
user manual, PGM 2-211 to 2-358 
from command (Mail) 
description, GEN 2-30 
message lists and, GEN 2-28 
from keyword (EQN), GEN 5-100E 
Front matter 
specifying, GEN 5-33 
fs 
4.2BSD improvement, SYS 1-16 
FS command (ms) 
specifying footnotes, GEN 5-8 
FS variable (awk) 
defined, PGM 3-6 
fs.h file 
4.2BSD improvement, SYS 5-5 
fscanf function 
See also sscanf function 
description, PGM 1-8 
fsck program 
See also badsect program 
4.2BSD improvement, SYS 1-19 
checking connectivity, SYS 2-12 
checking directory data blocks, 
SYS 2-12 
checking free blocks, SYS 2-10 
checking inode block count, SYS 
2-12 
checking inode links, SYS 2-11 
checking inode state, SYS 2-11 
checking super-block, SYS 2-10 
description, SYS 2-7 to 2-25 
error conditions, SYS 2-14 to 2-25 
rebuilding block allocation maps, 
SYS 2-11 
fsplit program 
splitting multi-function Fortran 
files, SYS 1-6 
fstab library 
4.2BSD improvement, SYS 1-15 
fstat system call 
4.2BSD improvement, SYS 1-11 
fsync system call
4.2BSD improvement, SYS 1-11
ft command (troff)
defined, GEN 5-59
specifying fonts, GEN 5-86
FTP server
description, SYS 5-50
ftp server program
ARPA file transfer protocol and,
SYS 1-6
ftpd server program
4.2BSD improvement, SYS 1-19
ftpusers file
description, SYS 5-50
ftruncate system call
4.2BSD improvement, SYS 1-11
Function (BC)
description, GEN 2-45 to 2-46
number permitted, GEN 2-45
Function call
defined, GEN 2-51
Function identifier
description, GEN 2-50
fz command (nroff/troff)
specifying font size, GEN 5-81


G 


g command (ed) 
defined, GEN 3-34 
process, GEN 3-46 
s command and, GEN 3-46E 
s command restriction and, GEN 
3-47 
specifying line numbers, GEN 
3-47 
specifying lines with text patterns, 
GEN 3-46 to 3-47 
specifying more than one 
command, GEN 3-47 
using, GEN 3-32 
g command (edit) 
description, GEN 3-19 
p command and, GEN 3-19 
substitute command and, GEN 
3-19 
uppercase letters and, GEN 3-19 
using, GEN 3-19E 
g command (ex) 
description, GEN 3-89 
G command (sed) 
defined, GEN 3-113 
g command (sed) 
defined, GEN 3-113
G command (vi)
defined, GEN 3-79
finding text lines, GEN 3-57
g flag (sed)
defined, GEN 3-110
g option (hunt)
defined, GEN 5-148
g option (troff)
defined, GEN 5-50
g option (uucp)
defined, SYS 5-132
gcore program
creating a core dump of running
process, SYS 1-6
genassym.c file
4.2BSD improvement, SYS 5-14
getc macro
defined, PGM 1-6
getch routine
defined, PGM 4-84
getchar macro
input and, PGM 1-4
getdtablesize system call
4.2BSD improvement, SYS 1-11
getgroups system call
4.2BSD improvement, SYS 1-11
gethostbynameandnet routine, SYS
3-13E
gethostid system call
4.2BSD improvement, SYS 1-11
gethostname system call
4.2BSD improvement, SYS 1-11
getitimer system call
4.2BSD improvement, SYS 1-11
getpagesize system call
4.2BSD improvement, SYS 1-11
getpass library
4.2BSD improvement, SYS 1-14
getpriority system call
4.2BSD improvement, SYS 1-11
getrlimit system call
4.2BSD improvement, SYS 1-11
getservbyname routine
specifying a protocol, SYS 3-14
getsockopt system call
4.2BSD improvement, SYS 1-11
getstr routine
defined, PGM 4-84
gettable program
4.2BSD improvement, SYS 1-19
retrieving NIC host data base,
SYS 5-48
gettimeofday system call
4.2BSD improvement, SYS 1-11
specifying value, SYS 5-74
gettmode routine 
defined, PGM 4-88 
variables set by, PGM 4-90T 
getty program 
See also gettytab file 
4.2BSD improvement, SYS 1-18, 
1-19 
gettytab file 
4.2BSD improvement, SYS 1-16 
getwd library 
4.2BSD improvement, SYS 1-15 
getyx routine 
defined, PGM 4-85 
GID 
description, SYS 4-4 
global command (ed) 
See g command (ed) 
See v command (ed) 
global command (edit) 
See g command (edit) 
global command (ex) 
See g command (ex) 
globl statement (as)
defined
go flag
accessing sdb symbol information,
SYS 1-5
goto command (C shell) 
defined, GEN 4-67 
form of, GEN 4-58E 
gprof command 
profiled systems and, SYS 5-78 
gprof program 
See also gprof.h file 
displaying execution time, SYS 
1-6 
gprof.h file 
4.2BSD improvement, SYS 5-5 
Graham, S.L., & others 
Berkeley Pascal User Manual, 
PGM 2-159 to 2-209 
Grave accent 
See Metacharacters 
Greek letters 
setting with -ms, GEN 5-10 
setting with troff, GEN 5-86E 
troff command list, GEN 5-96 
grep command (C shell) 
defined, GEN 4-67 
grep program 
finding lines with combinations of
text patterns, GEN 3-51
finding lines without specified text,
GEN 3-51E
finding specified text in a set of 
files, GEN 3-51, 3-51E 
nonalphabetic characters and, 
GEN 3-51 
spell and, GEN 2-13 
using, GEN 2-13E 
Grep program 
searching for text patterns, GEN 
2-13 
Group Identification Number 
See GID 
Group set 
description, SYS 1-3 
grouping command (sed) 
defined, GEN 3-113 
groups program 
display access list for user’s group, 
SYS 1-6 


H 


H command (sed) 
defined, GEN 3-113 
h command (sed) 
defined, GEN 3-113 
h command (troff) 
moving text backwards on a line, 
GEN 5-87 
specifying horizontal motion, GEN 
5-68 
H command (vi) 
defined, GEN 3-79 
h escape (Mail) 
description, GEN 2-25 
h flag (Mail) 
defined, GEN 2-36 
H macro (me) 
specifying column heads on 
continuing pages, GEN 5-42 
h macro (me) 
defined, GEN 5-42 
h option (inv) 
defined, GEN 5-147 
h option (nroff) 
defined, GEN 5-81
Haley, C.B., & others 
Berkeley Pascal User Manual, 
PGM 2-159 to 2-209 
hangman game 
4.2BSD improvement, SYS 1-17 
Hard limit
defined, SYS 2-3 
Hard lock 
compared to advisory lock, SYS 
1-33 
Hardcopy terminal 
vi and, GEN 3-73 
hardtabs option (ex) 
description, GEN 3-98 
Hash character 
See Sharp character 
Hat 
See Circumflex character (ed) 
hc command (nroff/troff)
defined, GEN 5-69 
he command (me) 
defined, GEN 5-41 
entering, GEN 5-23 
head command (C shell) 
defined, GEN 4-68 
Header 
See also Footer 
formatting, GEN 5-41 to 5-42 
specifying, GEN 5-23 
suppressing, GEN 2-36 
Header field 
defined, GEN 2-38 
headers command (Mail) 
See also ignore command (Mail) 
abbreviating, GEN 2-30 
description, GEN 2-30 
help command (Mail) 
description, GEN 2-30 
restriction, GEN 2-30 
using, GEN 2-22 
Henry, R.R., & Reiser, J.F. 
Berkeley VAX/UNIX Assembler 
Reference Manual, PGM 4-53 
to 4-65 
Here document 
description, GEN 4-9 to 4-10 
Hexadecimal notation 
BC language and, GEN 2-44 
hier 
4.2BSD improvement, SYS 1-17
history command (C shell) 
defined, GEN 4-68 
repeating previous commands, 
GEN 4-48 
History list 
description, GEN 4-41 to 4-43 
using, GEN 4-42E 
hl command (me) 
defined, GEN 5-45
figures and, GEN 5-26
hold command (Mail) 
See also preserve command (Mail) 
description, GEN 2-31 
hold option (Mail) 
defined, GEN 2-34 
storing mail, GEN 2-20 
Home directory 
defined, GEN 4-68 
returning to, GEN 4-49 
HOME variable (Bourne shell) 
description, GEN 4-11 
home variable (C shell) 
displaying your home directory, 
GEN 4-41 
Horizontal line
See Ruling 
Horton, M., & Joy, W. 
editing with vi, GEN 3-53 to 3-82
Ex Reference Manual, GEN 3-83
to 3-104
Host name 
represented by hostent structure, 
SYS 3-12E 
Hostent structure 
getting for host, SYS 3-13E 
hostid program 
displaying system unique 
identifier, SYS 1-6 
hostname program 
setting host name, SYS 1-6 
hosts database 
4.2BSD improvement, SYS 1-16 
hosts.equiv file 
description, SYS 5-49 
hp.c device driver 
4.2BSD improvement, SYS 5-14 
htable program
converting NIC host data base, 
SYS 5-48 
hunt program 
defined, GEN 5-146 
description, GEN 5-148 
fgrep and, GEN 5-148 
options list, GEN 5-148 
timing, GEN 5-149 
hw command (nroff/troff) 
defined, GEN 5-69 
hx command (me) 
defined, GEN 5-41 
hy command (nroff/troff) 
defined, GEN 5-69 


hy network interface driver 
4.2BSD improvement, SYS 1-16 
Hyphen
entering with text, GEN 5-22
Hyphenation (nroff/troff)
automatic, GEN 5-69
command list, GEN 5-52 
Hyphenation indicator character 
specifying, GEN 5-69 
HZ parameter 
description, SYS 5-122 


I 


i command (DC) 
changing the base of input 
numbers, GEN 2-62 
description, GEN 2-59 
i command (ed) 
defined, GEN 3-34 
using, GEN 3-31 to 3-32 
i command (ex) 
description, GEN 3-89 
i command (me) 
defined, GEN 5-44 
specifying italic font, GEN 5-36 
I command (ms) 
specifying italic, GEN 5-8 
i command (sed) 
See also a command (sed) 
defined, GEN 3-109 
I command (vi) 
defined, GEN 3-79 
i command (vi) 
defined, GEN 3-81 
description, GEN 3-58 
i flag (Mail) 
See also ignore option 
defined, GEN 2-36 
i flag (make) 
defined, PGM 3-17 
i flag (mkey) 
ignoring lines, GEN 5-147 
I option 
changed to -i, SYS 1-6 
i option 
specifying directory search paths, 
SYS 1-6 
i option (hunt) 
defined, GEN 5-148 
i option (inv) 
defined, GEN 5-148 
i option (nroff/troff) 
defined, GEN 5-49 
i-list
description, GEN 1-24 
i-node 
defined, PGM 4-10 
file description and, GEN 1-24 
i-number 
defined, GEN 1-24 
I/O
essentials of, GEN 1-23 to 1-24 
I/O request 
multiplexing among sockets and 
files, SYS 3-11 
I/O system 
description, PGM 4-8 to 4-10 
overview, PGM 4-67 to 4-73 
ibase 
defined, GEN 2-44, 2-51 
icheck program 
4.2BSD improvement, SYS 1-19 
ident parameter (config) 
defined, SYS 5-79 
Identifier 
defined, GEN 2-51 
kinds of, GEN 2-50 
Identifier (as) 
defined, GEN 6-53 
ie command (nroff/troff) 
defined, GEN 5-71 
if command (Bourne shell) 
description, GEN 4-13 to 4-14 
if command (C shell) 
See if/endif commands (C shell) 
if command (Mail) 
See if/endif commands (Mail) 
if command (nroff/troff) 
defined, GEN 5-71 
if/endif commands (C shell) 
See also else command (C shell) 
See also then command (C shell) 
defined, GEN 4-66, 4-68 
forms of, GEN 4-56 to 4-57 
if/endif commands (Mail) 
description, GEN 2-31 
restriction, GEN 2-31 
if/endif commands (nroff/troff) 
description, GEN 5-93 to 5-94 
reference list, GEN 5-52 
if/endif statement (as) 
defined, GEN 6-59 
if statement (as) 
See if/endif statement (as) 
if statement (awk) 
defined, PGM 3-9 
if statement (BC)
forming, GEN 2-54 
restriction, GEN 2-47 
writing, GEN 2-47 
ifdef command (M4) 
description, PGM 2-395 
ifelse command (M4) 
description, PGM 2-397 
IFS variable 
defined, GEN 4-12 
ig command (nroff/troff) 
defined, GEN 5-73 
ignore command (Mail) 
description, GEN 2-31 
ignore option (Mail) 
See also i flag (Mail) 
defined, GEN 2-34 
ignorecase option (ex) 
description, GEN 3-98 
ignoreeof variable (C shell) 
defined, GEN 4-68 
setting, GEN 4-41E 
ignoreeof option (Mail)
See also dot option 
defined, GEN 2-34 
ik driver 
4.2BSD improvement, SYS 1-16 
ik.c device driver 
4.2BSD improvement, SYS 5-12 
Ikonas frame buffer graphics device 
interface 
See ik driver 
Ikonas frame buffer graphics 
interface 
See ik.c device driver 
il network interface driver 
4.2BSD improvement, SYS 1-16 
Image 
defined, GEN 1-26 
imp network interface driver 
4.2BSD improvement, SYS 1-16 
IMP-11A LH/DH IMP interface 
See css network driver 
in command (me) 
See also ix command (me) 
entering, GEN 5-24 
in command (nroff/troff) 
defined, GEN 5-62 
in_cksum.c file
4.2BSD improvement, SYS 5-13 
include command (M4) 
description, PGM 2-396 
incr command (M4) 
description, PGM 2-395 


indent program 
formatting C program source, SYS 
1-6 
Indention 
command list, GEN 5-51 
resetting base, GEN 5-45 
specifying, GEN 5-24 
specifying with nroff/troff, GEN
5-62
Index 
See Table of contents 
index command (M4) 
description, PGM 2-397 
Index entry 
specifying, GEN 5-43 
Indexing 
description, GEN 5-143 to 5-155 
Indirect block 
inode and, SYS 2-8 
init program 
4.2BSD improvement, SYS 1-19 
description, GEN 1-30 
init_main.c file 
contents, SYS 5-8 
init_sysent.c file
contents, SYS 5-8 
initscr routine 
defined, PGM 4-86 
inode 
allocation states, SYS 2-11
defined, SYS 2-8 
disk space and, SYS 2-8 
types of, SYS 2-11 
Inode table 
setting size, SYS 5-121 
inode.h file 
4.2BSD improvement, SYS 5-6 
input 
defined, GEN 4-68 
Input base 
DC and, GEN 2-62
Input mode 
description, GEN 3-7 
Input/output 
See I/O 
insch routine 
defined, PGM 4-82 
Insert command (ed) 
See i command (ed) 
insert command (ex) 
See i command (ex) 
insert command (vi) 
See i command (vi) 


insertln routine
defined, PGM 4-82 
install command, SYS 5-55E 
install script 
installing software, SYS 1-6 
int function (awk) 
defined, PGM 3-8 
Interlan Ethernet interface 
See il network interface driver 
Intermediate language (C compiler) 
description, PGM 2-63 to 2-66 
Internet address 
binding, SYS 3-24 to 3-26 
binding in Internet domain, SYS 
3-8E 
binding with wildcard address, 
SYS 3-25E 
Internet port 
printing, SYS 3-16E 
Interprocess communication 
description, SYS 3-5 to 3-28 
transferring data, SYS 3-9E 
Interprocess communication 
facilities 
4.2BSD improvement, SYS 1-3 
Interrupt message 
description, GEN 3-9 
Interrupt signal 
See also onintr command (C
shell)
See also stty command (C shell) 
creating, GEN 1-31 
defined, GEN 4-68 
ignoring, GEN 2-36 
scripts and, GEN 4-59 
intro system call 
4.2BSD improvement, SYS 1-10 
inv program 
defined, GEN 5-146 
description, GEN 5-147 
options list, GEN 5-147 
Inverted indexes 
See Indexing 
I/O library 
restriction, GEN 2-15 
ioctl system call 
4.2BSD improvement, SYS 1-11 
ioctl.h file 
4.2BSD improvement, SYS 5-6 
iostat 
reporting kilobytes per second 
transferred for each disk, SYS 
1-6 
ip command (me)
See also np command
defined, GEN 5-40
specifying with label, GEN 5-30
IP command (ms)
indenting paragraphs, GEN 5-7
references and, GEN 5-7E
isprint library
4.2BSD improvement, SYS 1-14
it command (nroff/troff)
defined, GEN 5-65
Italic
See also Underlining
bolding, GEN 5-44
specifying, GEN 5-8
troff and, GEN 5-66
ix command (me)
defined, GEN 5-44


J 


j command (ed) 
joining lines, GEN 3-42, 3-48E 
j command (ex) 
description, GEN 3-90 
J command (vi) 
defined, GEN 3-79 
j number register (nroff/troff) 
defined, GEN 5-81 
Job 
defined, GEN 4-45, 4-69 
determining current job, GEN 
4-46 
suspending, GEN 4-46 
Job control command 
See also bg command (C shell) 
See also fg command (C shell) 
See also kill command (C shell) 
See also stop command (C shell) 
defined, GEN 4-69 
Job name 
beginning character, GEN 4-46 
Job number 
defined, GEN 4-69 
description, GEN 4-45 
jobs command (C shell) 
defined, GEN 4-69 
displaying jobs, GEN 4-47E 
Johnson, S.C. 
Lint command, PGM 3-39 to 3-50 
tour through portable C compiler, 
PGM 2-37 to 2-61 
Yacc, PGM 3-79 to 3-111 
join command (ex)
See j command (ex) 
Joy, W. 
C shell introduction, GEN 4-29 to 
4-74 
Joy, W., & Horton, M. 
editing with vi, GEN 3-53 to 3-82 
Ex Reference Manual, GEN 3-83 
to 3-104 
Joy, W., & Leffler, S.J. 
4.2BSD on VAX/VMS, SYS 5-17 
to 5-71 
Joy, W., & others 
4.2BSD Interprocess
Communication Primer, SYS 
3-5 to 3-28 
4.2BSD System Manual, PGM 
4-15 to 4-52 
Berkeley Pascal User Manual, 
PGM 2-159 to 2-209 
fast file system, SYS 1-23 to 1-38 
networking implementation notes, 
SYS 3-29 to 3-57 
Joyce, J., & Blau, R. 
Edit tutorial, GEN 3-8 to 3-23 
Justifying (nroff/troff) 
command list, GEN 5-51 
description, GEN 5-60 to 5-61 


K 


k command (DC) 
description, GEN 2-59 
scale value and, GEN 2-60 
k command (ed) 
marking a line, GEN 3-50E 
k command (ex) 
See also mark command (ex) 
description, GEN 3-90 
k escape sequence (nroff/troff) 
description, GEN 5-68 
k flag (mkey) 
specifying number of keys, GEN 
5-147 
k number register (nroff/troff) 
defined, GEN 5-81 
Keep 
See also Floating keep 
defined, GEN 5-26 
footnotes and, GEN 5-35 to 5-36 
index entries and, GEN 5-35 to 
5-36 
text formatting commands for, 
GEN 5-15E 


keep option (Mail) 
defined, GEN 2-34 
keepsave option (Mail) 
See also nosave option 
defined, GEN 2-35 
kern_acct.c file
contents, SYS 5-8
kern_clock.c file
4.2BSD improvement, SYS 5-8
kern_descrip.c file
contents, SYS 5-8
kern_exec.c file
contents, SYS 5-8
kern_exit.c file
contents, SYS 5-8
kern_fork.c file
contents, SYS 5-8
kern_mman.c file
contents, SYS 5-8
kern_proc.c file
contents, SYS 5-8
kern_prot.c file
contents, SYS 5-8
kern_resource.c file
contents, SYS 5-8
kern_sign.c file
contents, SYS 5-8
kern_subr.c file
contents, SYS 5-8
kern_synch.c file
contents, SYS 5-8
kern_time.c file
contents, SYS 5-8
kern_xxx.c file
contents, SYS 5-8
Kernel 
4.2BSD improvement, SYS 5-3 to 
5-15 
configuration, SYS 5-36 to 5-37 
implementation, PGM 4-5 to 4-8 
implementing devices, SYS 5-37 
kernel.h file 
4.2BSD improvement, SYS 5-5 
Kernighan, B.W. 
advanced editing with ed, GEN 
3-37 to 3-52 
introduction to ed, GEN 3-25 to 
3-35 
Ratfor language, PGM 2-111 to 
2-122 
troff tutorial, GEN 5-83 to 5-96 
UNIX for beginners, GEN 2-3 to 
2-16 


Kernighan, B.W., & Cherry, L.L. 
typesetting mathematics, GEN 
5-97 to 5-104 
Typesetting Mathematics - User’s 
Guide, GEN 5-105 to 5-114 
Kernighan, B.W., & Lesk, M.E. 
computer-aided instruction for
UNIX, GEN 6-3 to 6-16 
Kernighan, B.W., & others 
awk programming language, PGM 
3-5 to 3-12 
Kernighan, B.W., & Ritchie, D.M. 
M4 macro processor, PGM 2-393 
to 2-398 
programming UNIX, PGM 1-3 to 
1-24 
Kessler, P.B., & others 
Berkeley Pascal User Manual, 
PGM 2-159 to 2-209 
Key 
defined, GEN 5-147 
selected by program, GEN 5-145 
Key file 
defined, GEN 5-145 
Key letters 
reference list, GEN 5-152 
Key-making program 
format used, GEN 5-145 
Keyword 
supplementing, GEN 5-150 
Keyword (BC) 
reserved 
reference list, GEN 2-50 
Keyword parameter 
description, GEN 4-17 to 4-25 
Keyword statement (as) 
defined, GEN 6-56 
reference list, GEN 6-59 to 6-60 
KF command (ms) 
moving blocks of text, GEN 5-9 
kg driver 
4.2BSD improvement, SYS 1-16 
kgclock.c device driver 
4.2BSD improvement, SYS 5-12 
kgmon program 
See also gmon.out file 
4.2BSD improvement, SYS 1-19 
Kill character 
default, GEN 4-30 
kill command (C shell) 
background commands and, GEN 
4-37 
background jobs and, GEN 4-47E 
defined, GEN 4-69 
killing processes, GEN 2-11 
suspended jobs and, GEN 4-47 
killpg library routine 
See killpg system call 
killpg system call 
4.2BSD improvement, SYS 1-11 
KL-11 
See kg driver 


Kowalski, T.J., & McKusick, M.K.
fsck, SYS 2-7 to 2-25 
KS command (ms) 
keeping text blocks together, GEN 
5-9, 5-94E 


L 


L argument (nroff) 
centering and, GEN 5-27 
specifying, GEN 5-27 
l command (DC)
programming DC, GEN 2-62 
l command (ed)
backspaces and, GEN 3-37 
description, GEN 3-37 
long lines and, GEN 3-37 
-p command and, GEN 3-37 
tabs and, GEN 3-37 
l command (me)
centering list elements, GEN 5-27 
defined, GEN 5-42 
entering, GEN 5-25 
specifying fill mode, GEN 5-26 
specifying left justification, GEN 
5-27 
L command (vi) 
defined, GEN 3-79 
l flag (mkey)
specifying items to be ignored, 
GEN 5-147 
L number register (nroff/troff) 
defined, GEN 5-81 
l option (C shell)
description, GEN 2-6 
l option (hunt)
defined, GEN 5-148 
L-devices file 
defined, SYS 5-139 
L-dialcodes file 
defined, SYS 5-139 
L.sys file 
contents, SYS 5-135 
defined, SYS 5-141 
ownership of, SYS 5-138 


Label (as)
See Name label; Numeric label 
label command (sed) 
defined, GEN 3-114 
LABEL operator (C compiler) 
defined, PGM 2-65 
last 
displaying remote host, SYS 1-6 
lastcomm 
indicating program activity, SYS 
1-7 
Layer, K., & others 
Franz Lisp Manual, The, PGM 
2-211 to 2-358 
lc command (nroff/troff)
defined, GEN 5-66 
LCK file 
description, SYS 5-143 
Leader character (nroff/troff) 
setting, GEN 5-66 
uninterpreted, GEN 5-66 
Leadering 
specifying with troff, GEN 5-88 
Leading 
See Vertical spacing 
LEARN driver program 
defined, GEN 6-3 
description, GEN 2-6 
directory structure, GEN 6-8 
experience with students, GEN 
6-8 
introduction to UNIX, GEN 6-3 
to 6-16 
sequence of events, GEN 6-9 
vi and, SYS 1-7 
leaveok routine 
defined, PGM 4-86 
Leffler, S.J. 
building 4.2BSD systems with 
config, SYS 5-73 to 5-105 
improvements in 4.2BSD, SYS 
1-3 to 1-21 
kernel and 4.2BSD, SYS 5-3 to 
5-15 
Leffler, S.J., & Joy, W.N. 
4.2BSD on VAX/VMS, SYS 5-17 
to 5-71 
Leffler, S.J., & others 
4.2BSD Interprocess 
Communication Primer, SYS 
3-5 to 3-28 
4.2BSD System Manual, PGM
4-15 to 4-52 
fast file system, SYS 1-23 to 1-38
networking implementation notes, 
SYS 3-29 to 3-57 
left keyword (EQN), GEN 5-100E 
len command (M4) 
description, PGM 2-397 
length function (awk) 
defined, PGM 3-8 
Leres, C., & Shoens, K. 
Mail Reference Manual, GEN 
2-17 to 2-41 
Lesk, M.E. 
formatting tables, GEN 5-115 to 
5-131 
inverted indexes, GEN 5-143 to 
5-155 
preparing documents with -ms, 
GEN 5-13 to 5-16 
updating publication lists, GEN 
5-155 to 5-162 
using -ms macros with troff and 
nroff, GEN 5-5 to 5-12 
Lesk, M.E., & Kernighan, B.W. 
computer-aided instruction for 
UNIX, GEN 6-3 to 6-16 
Lesk, M.E., & Nowitz, D.A. 
a dial-up network of UNIX 
systems, SYS 5-123 to 5-129 
Lesk, M.E., & Schmidt, E. 
Lex program generator, PGM 
3-113 to 3-125 
Lex program generator 
description, PGM 3-113 to 3-125 
LG command (ms) 
increasing type size, GEN 5-8 
lg command (troff) 
defined, GEN 5-66 
libc.a library
remaking, SYS 5-120 
libI77.a library 
See f77 I/O library
Life game 
program for, PGM 4-94E 
Ligature (troff) 
types available, GEN 5-66 
limit command (C shell) 
displaying current limitations, 
GEN 4-51E 
setting limits, GEN 4-51E 
Line 
See Line drawing (nroff/troff) 
Line dot 
See Dot character (ed) 


Line drawing (nroff/troff) 
description, GEN 5-68 
Line length (nroff/troff) 
specifying, GEN 5-62, 5-86 
Line printer 
setting for serial lines, PGM 4-101 
setting remote, PGM 4-101 
Line printer control program 
See lpc program 
Line Printer Daemon
See lpd program 
Line Printer Queue program 
See lpq program 
Line printer spooling system 
devices supported, PGM 4-99, 
SYS 5-44 
file list, SYS 5-44 
setting up, SYS 5-44 
Line printer spooling system 
(4.2BSD) 
See also lpc program; pac program 
4.2BSD improvement, SYS 1-4, 
1-7, 1-18 
controlling access, PGM 4-100 to
4-101
error messages, PGM 4-103 to 
4-105 
filters and, PGM 4-102 
setting up, PGM 4-101 to 4-102 
user manual, PGM 4-99 to 4-105 
Line spacing 
See Vertical spacing 
Linking 
description, GEN 1-21 
Lint command 
checking C programs, PGM 3-39 
to 3-50 
lint command 
C and, GEN 2-15
creating libraries from C source 
code, SYS 1-7 
LINT configuration file 
using, SYS 5-88E 
LINT file 
4.2BSD improvement, SYS 5-11 
LINTRUP request 
See fcntl system call
lisp option (ex) 
description, GEN 3-99 
lisp option (vi) 
setting, GEN 3-68 
Lisp program 
See also vlp program 
4.2BSD improvement, SYS 1-7 
editing with vi, GEN 3-68 
List 
defined, GEN 5-25 
specifying in text, GEN 5-25 
text formatting commands for, 
GEN 5-15E 
text formatting commands for 
nested, GEN 5-15E 
list command 
See ls command (C shell) 
List command (ed) 
See l command (ed)
list command (ex) 
description, GEN 3-90 
list command (Mail) 
description, GEN 2-31 
list files command 
See ls command (C shell) 
list option (ex) 
description, GEN 3-99 
listen system call 
4.2BSD improvement, SYS 1-11
incoming requests and, SYS 3-9E 
ll command (me) 
See also xl command (me) 
defined, GEN 5-45 
ll command (nroff/troff) 
defined, GEN 5-62 
resetting line length, GEN 5-86E 
ln
creating symbolic links, SYS 1-7 
lo command (me) 
defined, GEN 5-45 
lo network interface 
4.2BSD improvement, SYS 1-16 
load command (DC) 
See l command (DC)
local command (Mail) 
description, GEN 2-31 
Local motion 
defined, GEN 5-67 
Location counter (as) 
See also bss segment 
defined, GEN 6-55 
Locore.c file 
4.2BSD improvement, SYS 5-13 
locore.s file 
4.2BSD improvement, SYS 5-14 
installing device driver and, SYS
5-119
LOG file 
description, SYS 5-142 
log function (awk)
defined, PGM 3-8 
Logging in 
description, GEN 2-3 to 2-4 
prerequisites, GEN 2-3 
procedure, GEN 3-5 
recording attempts, SYS 4-12 
Logging out, GEN 3-8E 
description, GEN 2-5 
Login directory 
startup file and, GEN 2-12 
login file 
See also logout file 
background jobs and, GEN 4-48E 
defined, GEN 4-69 
logging in and, GEN 4-39, 4-39E 
rlogin server and, SYS 1-7 
telnetd server program and, SYS 
1-7 
Login shell 
See also Script file 
defined, GEN 4-69 
logging in and, GEN 4-39 
logout command 
exiting from UNIX, GEN 3-8 
logout command (C shell) 
defined, GEN 4-69 
logout file 
See also login file 
C shell and, GEN 4-39 
defined, GEN 4-69 
London, T.B., & Reiser, J.F. 
regenerating system software, SYS 
5-117 to 5-122 
setting up UNIX/32V V1.0, SYS 
5-107 to 5-115 
longjmp library 
old semantics and, SYS 1-15 
longjump library 
4.2BSD improvement, SYS 1-15 
longname routine 
defined, PGM 4-86 
lookbib command 
checking the data base, GEN 
5-150 
Loop 
variables and, GEN 4-60 
Low-level I/O 
description, PGM 1-8 to 1-12 
lp command (me)
defined, GEN 5-40 
entering, GEN 5-29 


LP command (ms) 
specifying block paragraphs, GEN 
5-5 
lp.c device driver 
4.2BSD improvement, SYS 5-12 
lpc program
4.2BSD improvement, SYS 1-4, 
1-18, 1-19 
description, PGM 4-100 
lpd program
description, PGM 4-99 
requests understood 
reference list, PGM 4-100 
lpd server program
4.2BSD improvement, SYS 1-20 
lpq program
4.2BSD improvement, SYS 1-7
description, PGM 4-100 
lpr command (C shell) 
defined, GEN 4-69 
lpr program
lpd and, PGM 4-100
lprm program 
4.2BSD improvement
description, PGM 4-100 
lq command (me) 
specifying quotation marks, GEN 
5-38 
ls command (C shell) 
4.2BSD improvement, SYS 1-7
defined, GEN 4-69 
description, GEN 2-6 
listing files in three columns, 
GEN 2-11 
specifying numeric sort, GEN 
4-32E 
ls command (Mail)
displaying files on your terminal, 
GEN 2-10 
ls command (me) 
entering, GEN 5-23 
ls command (nroff/troff) 
defined, GEN 5-61 
lseek system call
4.2BSD improvement, SYS 1-11
description, PGM 1-11 
lt command (nroff/troff)
defined, GEN 5-70 


M 


m command (ed)
reversing two adjacent lines, GEN 
3-50E 


m command (ed) 
caution, GEN 3-50 
defined, GEN 3-34 
moving text, GEN 3-50E 
using, GEN 3-32 
m command (edit) 
context search and, GEN 3-15 
moving text, GEN 3-14 
m command (ex) 
description, GEN 3-90 
M command (vi) 
defined, GEN 3-79 
m command (vi) 
defined, GEN 3-81 
m escape (Mail) 
description, GEN 2-25 
m option (nroff/troff) 
defined, GEN 5-49 
m option (uuclean) 
defined, SYS 5-137 
m option (uucp) 
defined, SYS 5-132 
m1 command (me)
defined, GEN 5-41 
m2 command (me) 
defined, GEN 5-41 
m3 command (me) 
defined, GEN 5-42 
m4 command (me) 
defined, GEN 5-42 
M4 macro processor 
arguments, PGM 2-395 
arithmetic built-ins, PGM 2-395 
command line format, PGM 2-393 
conditionals, PGM 2-397 
defining macros, PGM 2-393 to 
2-395 
description, PGM 2-393 to 2-398 
manipulating files, PGM 2-396 
manipulating strings, PGM 2-397 
operation, PGM 2-393 
printing, PGM 2-397 
m4 macro processor 
4.2BSD improvement, SYS 1-7
machdep.c file
4.2BSD improvement, SYS 5-14
machine file
4.2BSD improvement, SYS 5-4
Machine instruction statement (as) 
syntax, GEN 6-60 to 6-63 
machine type parameter (config) 
defined, SYS 5-79 
Macro (M4) 
defining, PGM 2-393 to 2-395 
Macro (nroff)
defined, GEN 5-35 
defining, GEN 5-35E 
naming, GEN 5-35 
using, GEN 5-35E 
Macro (nroff/troff) 
arguments, GEN 5-63 
defined, GEN 5-62 
description, GEN 5-62 to 5-65 
diversions, GEN 5-63 
printing, GEN 5-73 
traps, GEN 5-64 
Macro (troff) 
arguments and, GEN 5-92 to 5-93 
arguments and blanks, GEN 5-93 
arguments and trailing 
punctuation, GEN 5-92 
Macro (vi) 
See also Word abbreviation 
types of, GEN 3-68 
Macro definition (make), PGM 
3-15E 
defined, PGM 3-15 
Macro-invocation trap (nroff/troff) 
description, GEN 5-64 
magic option (ex) 
description, GEN 3-96 
magic option (ex) 
description, GEN 3-99 
Magnetic tape 
FORTRAN-77 and, PGM 2-84 
Mail 
adding to mail list, GEN 2-25 
answering, GEN 2-19 to 2-20 
C shell watching for, GEN 4-39E 
canceling, GEN 2-18 
changing the subject line, GEN 
2-25 
commands to be executed by the 
shell, GEN 2-28 
defined, GEN 2-38 
deleting, GEN 2-20 
description, GEN 2-5 
filing, GEN 2-24 
format, GEN 2-37 
forwarding, GEN 2-25 
holding in system mailbox, GEN 
2-31 
including in other mail, GEN 2-25 
indicating indirect recipients,
GEN 2-25
keeping, GEN 2-35 
keeping outgoing, GEN 2-35 
length restricted, GEN 2-37 
line width, GEN 2-37 
maintaining groups of mail, GEN 
2-23 
message lists and user names, 
GEN 2-28 
notification of, GEN 2-17 
paging, GEN 2-20 
process, GEN 2-17 
protecting, GEN 2-34E 
reading, GEN 2-18 to 2-19 
reading in home directory, GEN 
2-21 
reading next, GEN 2-19 
reading other people’s, GEN 2-36 
recovering deleted, GEN 2-30 
saving related in a file, GEN 2-32 
searching for subjects, GEN 2-28 
sending, GEN 2-18 
sending multiple messages, GEN 
2-28 
sending remote, SYS 5-126 
sending source program text, GEN 
2-33 
sending to file, GEN 2-27 
sending to folder, GEN 2-27 
sending to list, GEN 2-21 
sending to multiple users, GEN 
2-18 
sending to other machines, GEN 
2-26 to 2-27 
sending to programs, GEN 2-27 
sending to user name, GEN 2-27 
specifying mailbox, GEN 2-36 
terms defined, GEN 2-38 
writing to others online, GEN 2-5 
mail command 
abbreviating, GEN 2-20 
description, GEN 2-31 
uses of, GEN 2-18 
Mail list 
editing, GEN 2-25 
Mail program 
setting up, SYS 5-44 
mail program 
4.2BSD improvement, SYS 1-7 
defined, GEN 4-69 
escaping temporarily to command 
mode, GEN 2-26 
escaping temporarily to shell, 
GEN 2-25 
reading folders, GEN 2-23 
reference manual, GEN 2-17 to 
2-41
sending source program text, GEN
2-33 
shell and, GEN 2-32 
suspending, GEN 4-37E 
using, GEN 2-17 to 2-41 
Mail Reference Manual 
See also Mail program 
Mail routing facility 
See sendmail 
mail system 
See also sendmail 
MAIL variable 
description, GEN 4-11 
mailaddr 
4.2BSD improvement, SYS 1-17 
Mailbox 
defined, GEN 2-38 
mailrc file, GEN 2-21E
defined, GEN 2-21 
specifying folder directory, GEN 
2-23 
make command 
command line format, PGM 3-16 
operation, PGM 3-16 to 3-17 
make depend command 
system source code and, SYS 5-77 
make directory command 
See mkdir command (C shell) 
make program 
See also makefile 
4.2BSD improvement, SYS 1-7 
C and, GEN 2-15 
defined, GEN 4-69 
description, PGM 3-13 to 3-21 
description file for, PGM 3-18 to 
3-20 
maintaining related files, GEN 
4-53 
operation, PGM 3-13 to 3-15 
suffix list, PGM 3-17 
transformation paths 
summary, PGM 3-17 
warnings, PGM 3-20 
MAKEDEV script 
See also MAKEDEV.local file 
4.2BSD improvement, SYS 1-20 
makefile 
See also make program 
defined, GEN 4-69 
description, GEN 4-53 
modifying for uucp, SYS 5-139 
makefile.vax file 
contents, SYS 5-11 


makelinks command 
source modules and, SYS 5-78 
maketemp command (M4) 
description, PGM 2-396 
man command (Bourne shell) 
printing the UNIX manual, GEN 
4-15 
printing UNIX manual, GEN 
4-16F 
man command (C shell) 
accessing online programmer’s 
manual, GEN 4-63E, 4-69E 
using, GEN 2-6 
Manual 
defined, GEN 4-69 
map command (ex) 
See also unmap command (ex) 
description, GEN 3-90 
Maranzano, J.F., & Bourne, S.R. 
ADB debugging program, PGM 
3-51 to 3-77 
Margin number 
setting, GEN 5-44 
mark command (ex) 
See also k command (ex) 
description, GEN 3-90 
Mass storage 
UNIX interfaces, SYS 1-36 
MASSBUS 
description, SYS 5-18 
specifying, SYS 5-19 
MASTER mode 
description, SYS 5-135 
Mathematics 
text formatting commands for, 
GEN 5-14E 
typesetting, GEN 5-97 to 5-104, 
5-105 to 5-114 
MAXMEM parameter 
description, SYS 5-121 
MAXUMEM parameter 
See also MAXMEM parameter 
description, SYS 5-121 
MAXUPRC parameter 
description, SYS 5-121 
maxusers parameter (config) 
defined, SYS 5-79 
mba.c device driver 
4.2BSD improvement, SYS 5-14 
mbox command (Mail) 
abbreviating, GEN 2-22 
description, GEN 2-31 
saving unread mail, GEN 2-22 
mbox file
mail and, GEN 2-20
system mailbox and, GEN 2-20
mbuf.h file
4.2BSD improvement, SYS 5-5
mc command (nroff/troff)
defined, GEN 5-72
McKusick, M.K., & Kowalski, T.J.
fsck, SYS 2-7 to 2-25
McKusick, M.K., & others 
4.2BSD System Manual, PGM
4-15 to 4-52 
Berkeley Pascal User Manual, 
PGM 2-159 to 2-209 
fast file system, SYS 1-23 to 1-38 
McMahon, L.E. 
sed stream editor and, GEN 3-105 
to 3-114 
me macro package
initializing, GEN 5-40 
naming convention, GEN 5-39 
predefined strings, GEN 5-47 
reference manual, GEN 5-39 to 
5-48 
Me Reference Manual, GEN 5-39 
See also me macro package 
mem.c file 
4.2BSD improvement, SYS 5-14 
Memorandum 
text formatting commands for, 
GEN 5-14E 
mesg option (ex) 
description, GEN 3-99 
Message 
See also Mail 
defined, GEN 2-38 
Message list 
defined, GEN 2-28, 2-38 
Metacharacters (Bourne shell) 
defined, GEN 4-5 
quoting, GEN 4-5 
quoting a string, GEN 4-5E 
quoting mechanisms, GEN 4-20F 
reference list, GEN 4-27 
Metacharacters (C shell) 
defined, GEN 4-69 
description, GEN 4-32 
reference list, GEN 4-62 
using with command arguments, 
GEN 4-35 
Metacharacters (ed) 
character classes and, GEN 3-41 
deleting, GEN 3-38 
delimiting text for s command, 
GEN 3-39 
editing with, GEN 3-37 to 3-43 
entering, GEN 3-33 
reference list, GEN 3-33 
searching for, GEN 3-39, 3-41 
Metacharacters (ed)
combining, GEN 3-40 
description, GEN 3-38 to 3-42 
Metacharacters (ex) 
X and, GEN 3-96 
Metacharacters (me) 
reference list, GEN 5-47 
Metacharacters (nroff/troff) 
specifying, GEN 5-79 
Metacharacters (troff) 
automatically translated, GEN 
5-86 
command list, GEN 5-96 
entering, GEN 5-86 
metoo option (Mail) 
defined, GEN 2-35 
MFLAGS macro 
supplying flags to make, SYS 1-7 
mille game 
4.2BSD improvement, SYS 1-17 
Mini-root file system 
booting from, SYS 5-25 
copying, SYS 5-24 
Minus sign 
translating for troff, GEN 5-86 
mk command (nroff/troff) 
See also rt command (nroff/troff); 
sp command (nroff/troff) 
defined, GEN 5-60 
mkdir command 
4.2BSD improvement, SYS 1-7 
creating directories, GEN 2-10 
mkdir command (C shell) 
creating a directory, GEN 4-48 
defined, GEN 4-70 
mkdir system call 
4.2BSD improvement, SYS 1-11 
mkey program 
defined, GEN 5-146 
description, GEN 5-147 
mkfs program 
See newfs program 
4.2BSD improvement, SYS 1-20 
mman.h file 
future plans and, SYS 5-5 
Modifier (C shell) 
See also Command substitution
defined, GEN 4-70 
description, GEN 4-57 
restriction, GEN 4-57n 
more program 
defined, GEN 4-70 
paging mail, GEN 2-20 
terminal screen and, GEN 4-37 
Morris, R., & Cherry, L. 
BC and, GEN 2-48 to 2-55 
DC and, GEN 2-57 to 2-64 
Morris, R., & Thompson, K. 
password system, SYS 4-7 to 4-12 
mos 
old version of -ms, GEN 5-17 
Mosher, D., & others 
4.2BSD System Manual, PGM 
4-15 to 4-52 
mount command 
unprivileged users and, SYS 4-5 
mount program 
4.2BSD improvement, SYS 1-20 
mount.h file 
4.2BSD improvement, SYS 5-6 
Move command (ed) 
See m command (ed) 
move command (edit) 
See m command 
move command (ex) 
See m command (ex) 
move routine 
defined, PGM 4-83 
mpx system call
See socket system call and related
system calls
ms macro package 
See also -mos 
4.2BSD improvement, SYS 1-18
CAI script for, GEN 6-7 
command reference list, GEN 
5-11 
default settings, GEN 5-9 
entering cover sheet, GEN 5-5 
entering first page, GEN 5-5 
entering page footer, GEN 5-6 
entering page heading, GEN 5-6 
entering paragraphs, GEN 5-5 
entering section heads, GEN 5-6 
keeping text blocks together, GEN 
5-9 
order for input commands, GEN 
5-12F 
preparing documents, GEN 5-13 
to 5-16
printing files on the terminal, 
GEN 5-9E 
register name reference list, GEN 
5-11 
revised version, GEN 5-17 to 5-19 
specifying column format, GEN 
5-6 
using with troff and nroff, GEN 
5-5 to 5-12 
ms package 
description, GEN 2-12 
formatting a document with nroff, 
GEN 2-13 
formatting a document with troff, 
GEN 2-12 
MSGBUFS parameter
description, SYS 5-122 
mt 
showing state of tape drive, SYS 
1-7 
mtab 
4.2BSD improvement, SYS 1-16
Multiplication 
DC and, GEN 2-61 
Multiplicative operator 
description, GEN 2-52 
Multitasking 
description, GEN 1-29 
mv command
renaming a file, GEN 2-7 
mv program 
4.2BSD improvement, SYS 1-7 
mv program (ed) 
renaming a file, GEN 3-47 
mvcur routine 
defined, PGM 4-88 
mvwin routine 
defined, PGM 4-86 


N 


n command (ex) 
description, GEN 3-90 
n command (sed)
defined, GEN 3-108 
N command (vi) 
See also n command (vi) 
defined, GEN 3-79 


n command (vi)
See also N command (vi)
defined, GEN 3-81
N flag (Mail) 
See also noheader option 
defined, GEN 2-36 
n flag (Mail) 
defined, GEN 2-36 
n flag (make) 
defined, PGM 3-17 
n flag (mkey) 
ignoring words, GEN 5-147 
n flag (sed) 
defined, GEN 3-106 
n option 
specifying numeric sort, GEN 4-32 
n option (inv) 
defined, GEN 5-148 
n option (nroff/troff) 
defined, GEN 5-49 
n option (uuclean) 
defined, SYS 5-137 
n1 command (me)
defined, GEN 5-44 
n2 command (me) 
defined, GEN 5-44 
Name label (as) 
defined, GEN 6-55 
NAME operator (C compiler) 
defined, PGM 2-66 
Named expression 
defined, GEN 2-51 
nami routine 
See also nami.h file 
nami.h file 
4.2BSD improvement, SYS 5-5 
NBUF parameter 
description, SYS 5-121 
NCALL parameter 
description, SYS 5-122 
NCARGS parameter 
description, SYS 5-122 
NCLIST parameter 
description, SYS 5-122 
ND command (ms) 
cover sheet and, GEN 5-9 
ne command (nroff/troff) 
defined, GEN 5-59 
NEQN program 
See also EQN program 
description, GEN 5-33 
formatting mathematics, GEN 
2-13 
net library 
4.2BSD improvement, SYS 1-15
net program 
UNIX distribution and, SYS 1-7 
netstat program
displaying network statistics, SYS 
1-7, 5-51E 
displaying routing table contents, 
SYS 5-51E 
Network 
See Dial-up network 
See uucp system 
troubleshooting, SYS 5-57 
Network data base 
files list, SYS 5-48 
Network library routines 
description, SYS 3-12 to 3-16 
Network name 
represented by netent structure, 
SYS 3-13E 
Network server program 
included with system, SYS 5-50T 
started up automatically at boot 
time, SYS 5-49T 
network server program 
reference list, SYS 5-49 
Network Systems Hyperchannel 
Adapter 
See hy network interface driver 
Networking 
implementation, SYS 3-29 to 3-57 
networks database 
4.2BSD improvement, SYS 1-16 
newfs program 
See also mkfs program 
4.2BSD improvement, SYS 1-18,
1-20
newgrp command 
See Group set 
newwin routine 
defined, PGM 4-86 
next command (ex) 
See n command (ex) 
next command (Mail) 
abbreviating, GEN 2-31 
description, GEN 2-31 
next statement (awk) 
defined, PGM 3-9 
NF variable (awk) 
determining number of fields, 
PGM 3-6 
NFILE parameter 
description, SYS 5-121 
NH command (ms) 
entering section heads, GEN 5-6E 
specifying numbered section heads, 
GEN 5-6 


nh command (nroff/troff) 
defined, GEN 5-69 
NIC host data base 
retrieving, SYS 5-48E 
NINODE parameter 
description, SYS 5-121 
nl routine 
defined, PGM 4-87 
NLABEL operator (C compiler) 
defined, PGM 2-64 
nm command (nroff/troff) 
defined, GEN 5-70 
NMOUNT parameter 
description, SYS 5-121 
nn command (nroff/troff) 
defined, GEN 5-70 
Nobreak control character 
changing, GEN 5-67 
noclobber variable (C shell) 
defined, GEN 4-70 
protecting files and, GEN 4-41 
NOFILE parameter 
description, SYS 5-121 
noglob variable (C shell), GEN 
4-56E 
defined, GEN 4-70 
noheader option (Mail) 
See also -N flag 
See also quiet option 
defined, GEN 2-35 
nosave option (Mail) 
See also keepsave option 
defined, GEN 2-35 
notify command (C shell) 
See also notify variable 
defined, GEN 4-70 
reporting job complete, GEN 4-47 
notify variable (C shell) 
See also notify command (C shell) 
background jobs and, GEN 4-45 
Nowitz, D.A. 
implementing uucp, SYS 5-131 to 
5-144 
Nowitz, D.A., & Lesk, M.E. 
a dial-up network of UNIX 
systems, SYS 5-123 to 5-129 
np command (me) 
defined, GEN 5-40 
numbering paragraphs 
automatically, GEN 5-31E 
NPROC parameter 
description, SYS 5-121 
nr command (me) 
indenting sections, GEN 5-32E
specifying with li, GEN 5-30 
nr command (nroff/troff) 
defined, GEN 5-65 
NR variable (awk) 
determining current record 
number, PGM 3-5 
nroff text processor 
See also nroff/troff text processor 
See also troff text processor 
calling, GEN 5-21E 
defined, GEN 2-12 
device resolution and, GEN 5-56 
entering text, GEN 5-22 
formatting a document with -ms, 
GEN 2-13 
function, GEN 5-22 
invoking, GEN 5-49 
stopping printer to change paper, 
GEN 5-49 
writing papers using -me, GEN 
5-21 to 5-38 
nroff/troff text processor 
See also -ms macros 
See also nroff text processor 
See also troff text processor 
-ms macros and, GEN 5-5 to 5-12 
boxing words, GEN 5-69 
breaking a line, GEN 5-60 
character set, GEN 5-57 
character translation, GEN 5-66 
concealed newlines and, GEN 
5-67 
control characters beginning lines,
GEN 5-60 
defined, GEN 5-49 
description, GEN 2-12 
error messages, GEN 5-73 
input, GEN 5-56 
justifying text, GEN 5-61 
marking horizontal space, GEN 
5-68 
numbering output lines, GEN 
5-70 
numerical expressions, GEN 5-57 
numerical parameters, GEN 5-56
post processors and, GEN 5-50 
preprocessors and, GEN 5-50 
specifying conditional input, GEN 
5-71 
specifying indention, GEN 5-62 
specifying line length, GEN 5-62 
specifying page margins, GEN 
5-74E 
specifying vertical spacing, GEN 
5-61 
switching environment, GEN 5-71 
transparent throughput, GEN 
5-67 
transposing characters, GEN 5-67 
underlining words, GEN 5-69 
user’s manual, GEN 5-49 to 5-81 
writing paragraph macros, GEN 
5-75E 
Nroff/Troff User’s Manual 
update, GEN 5-81 
Nroff/Troff User’s Manual, GEN 
5-49 to 5-81 
See also nroff/troff text processor 
ns command (nroff/troff) 
defined, GEN 5-62 
NTEXT parameter 
description, SYS 5-122 
nu command (edit) 
printing text with line numbers, 
GEN 3-11 
nu command (ex) 
description, GEN 3-91 
NULL 
defined, PGM 1-21 
NULL operator (C compiler) 
defined, PGM 2-66 
Null statement (as) 
defined, GEN 6-55 
Number 
internal representation in DC, 
GEN 2-59 
right justifying with troff, GEN 
5-87 
number command (DC) 
description, GEN 2-57
number command (edit) 
See nu command (edit) 
number command (ex) 
See nu command (ex) 
number option (ex) 
description, GEN 3-99 
Number register (nroff/troff) 
See also nr command (nroff/troff) 
See also specific registers 
command list, GEN 5-52, 5-55 
description, GEN 5-65 to 5-66 
Number register (troff) 
description, GEN 5-91 to 5-92 
predefined, GEN 5-91 
Numeric label (as) 
defined, GEN 6-55 
nx command (nroff/troff)
defined, GEN 5-72 


O 


o command (DC) 
changing the output base, GEN 
2-62 
description, GEN 2-59 
o command (ex) 
See also open option 
description, GEN 3-91 
line editing and, GEN 3-85 
o command (nroff/troff) 
description, GEN 5-68 
O command (Rogue) 
using, GEN 6-23 
O command (vi) 
See also o command (vi) 
See also slowopen option 
defined, GEN 3-79 
o command (vi)
See also O command (vi) 
defined, GEN 3-81 
o option (hunt) 
defined, GEN 5-148 
o option (nroff/troff) 
defined, GEN 5-49 
obase 
defined, GEN 2-44, 2-51 
Octal 
converting to decimal, GEN 2-44 
od 
4.2BSD improvement, SYS 1-7 
of command (me) 
defined, GEN 5-41 
of filter 
calling, PGM 4-102E 
printers and, PGM 4-102 
OF macro 
specifying page footers, GEN 5-19 
OFS variable 
defined, PGM 3-6 
oh command (me) 
defined, GEN 5-41 
OH macro 
specifying page headings, GEN 
5-19 
oldcsh
4.2BSD and, SYS 1-7 
onintr command (C shell) 
See also Interrupt signal 
defined, GEN 4-70 
open command (ex)
See o command (ex)
open function 
See also fopen function
description, PGM 1-10 
open option (ex) 
description, GEN 3-99 
open system call 
4.2BSD improvement, SYS 1-11 
Operators 
available, GEN 2-43 
optim routine (C compiler) 
description, PGM 2-66 to 2-67 
optim routine (C shell) 
See also unoptim routine (C shell) 
optimize option (ex) 
description, GEN 3-99 
Option (C shell) 
combining, GEN 2-6 
Option (ex) 
See also specific options 
reference list, GEN 3-97 to 3-101 
Option (Mail) 
See also specific options 
defined, GEN 2-38 
reference list, GEN 2-33 to 2-36, 
2-40T 
setting, GEN 2-32, 2-32K 
Option (nroff/troff) 
invoking, GEN 5-50 
reference list, GEN 5-49 to 5-50 
Option (vi) 
See also specific options 
listing values, GEN 3-65 
reference list, GEN 3-65 
setting, GEN 3-65 
setting automatically, GEN 3-65 
options parameter (config) 
defined, SYS 5-79 
ORS variable 
defined, PGM 3-6 
os command (nroff/troff) 
defined, GEN 5-62 
Ossanna, J.F. 
Nroff/Troff User’s Manual, GEN 
5-49 to 5-81 
Out of band data 
description, SYS 3-23 
flushing I/O on receipt, SYS 
3-23F 
Output 
defined, GEN 4-70 
Output base 
DC and, GEN 2-62 
over keyword (EQN)
specifying fractions, GEN 5-99K
overlay routine
defined, PGM 4-83
Overstrike command (nroff/troff)
See o command (nroff/troff)
Overstriking
creating with troff, GEN 5-88
overwrite routine
defined, PGM 4-83


P 


p command (DC) 
description, GEN 2-58
p command (ed) 
defined, GEN 3-34 
printing a line, GEN 3-28 
printing all lines, GEN 3-28 
printing last line, GEN 3-28 
printing lines, GEN 3-27 
stopping, GEN 3-28 
using, GEN 3-27 to 3-28 
p command (edit) 
printing buffer contents, GEN 
3-10 
u command and, GEN 3-16 
p command (ex) 
description, GEN 3-91 
P command (me) 
defined, GEN 5-46 
specifying front matter, GEN 5-33 
p command (sed) 
defined, GEN 3-111 
P command (vi) 
See also p command (vi) 
defined, GEN 3-79 
p command (vi) 
See also P command (vi) 
defined, GEN 3-81 
p escape (Mail) 
description, GEN 2-24 
p flag (make) 
defined, PGM 3-17 
p flag (sed) 
defined, GEN 3-110 
p macro (me) 
defined, GEN 5-41 
P number register (nroff/troff) 
defined, GEN 5-81 
p option (hunt) 
defined, GEN 5-149 
p option (inv) 
defined, GEN 5-148 
p option (troff)
defined, GEN 5-50 
p option (uuclean) 
defined, SYS 5-137 
pa command (me) 
defined, GEN 5-44 
pac program 
4.2BSD improvement, SYS 1-18, 
1-20 
Page 
command list, GEN 5-51 
formatting the last page with a 
macro, GEN 5-77E 
printing specific, GEN 5-49 
setting margins with nroff/troff, 
GEN 5-74E 
specifying blank, GEN 5-44 
specifying new, GEN 5-23 
Page commands 
description, GEN 5-59 
Page footer 
entering in text file, GEN 5-6 
specifying, GEN 5-70 
specifying for multiple columns 
with a macro, GEN 5-75E 
specifying with troff, GEN 5-91 
varying on alternate pages, GEN 
5-19 
Page header 
entering in text file, GEN 5-6 
specifying for multiple columns 
with a macro, GEN 5-75E 
specifying formats for alternating, 
GEN 5-71 
specifying with troff, GEN 5-90 
Page heading 
specifying, GEN 5-70 
varying on alternate pages, GEN 
5-19 
Page layout 
specifying, GEN 5-23 
Page number 
setting arabic, GEN 5-44 
setting roman, GEN 5-44 
specifying, GEN 5-59, 5-91 
specifying for appendix, GEN 5-46 
specifying for chapter, GEN 5-46 
Page offset (nroff/troff) 
specifying, GEN 5-59 
Page trap (nroff/troff) 
description, GEN 5-64 
pagesize program 
printing system page size, SYS 
1-7 
Paging
defined, GEN 3-13 
versus scrolling, GEN 3-56 
Paper 
formatting, GEN 5-34F 
Paragraph, GEN 5-40 
-me restrictions, GEN 5-40
creating decorative initial capital 
with troff, GEN 5-86 
editing with vi, GEN 3-61 
entering in text file, GEN 5-5 
indenting, GEN 5-7 to 5-8 
numbering automatically, GEN 
5-31 
specifying, GEN 5-22 
specifying block format, GEN 
5-29 
specifying hanging indent format, 
GEN 5-29 
specifying hanging indent format 
with a macro, GEN 5-75E 
specifying indention, GEN 5-30 
specifying indention amount, 
GEN 5-39E 
vi definition, GEN 3-61 
writing a macro for, GEN 5-75E 
paragraph option (ex) 
description, GEN 3-99 
param.c file 
contents, SYS 5-11, 5-103 
param.h file 
See also kernel.h file 
4.2BSD improvement, SYS 5-6, 
5-138 
Parentheses (BC) 
primitive expression and, GEN 
2-51 
Parentheses (EQN) 
typesetting in proper size, GEN 
5-100E 
Pascal programming language 
See Berkeley Pascal programming 
language 
Passive system 
defined, SYS 5-123 
passwd 
concurrent updates to password 
file and, SYS 1-8 
Password 
entering, GEN 3-5 
Password entry program 
predictable passwords and, SYS 
4-10 
random numbers and, SYS 4-11 
Password file 
restricting users, GEN 1-31 
security and, SYS 4-8 
Password system 
history, SYS 4-7 to 4-12 
Pasting and cutting 
See m command (ed) 
PATH variable (Bourne shell) 
description, GEN 4-11 to 4-12 
path variable (C shell) 
See also rehash command (C 
shell) 
default value, GEN 4-40 
defined, GEN 4-40, 4-70 
Pathname 
See also Absolute pathname 
defined, GEN 2-9, 4-71 
description, GEN 4-33 
Pattern (awk) 
description, PGM 3-6 to 3-7 
Pattern space 
defined, GEN 3-106 
pc
4.2BSD improvement, SYS 1-8
pc command (nroff/troff)
defined, GEN 5-70
pc/pi
4.2BSD improvement, SYS 1-8
pcb.h file
4.2BSD improvement, SYS 5-14
pcl network interface driver
4.2BSD improvement, SYS 1-16 
pd command (me) 
defined, GEN 5-43 
pdx debugger 
pi and, SYS 1-8 
Period 
See Dot character (ed) 
perror function 
description, PGM 1-12 
perror library 
4.2BSD improvement, SYS 1-15 
pg flag 
collecting information for gprof, 
SYS 1-5 
pg option 
creating images for gprof, SYS 1-6 
phones database 
See also tip program 
4.2BSD improvement, SYS 1-17 
Phototypesetter 
defined, GEN 5-98 
stopping automatically to reload, 
GEN 5-49 
Phototypesetting 
See nroff/troff text processor 
PHYSPAGES parameter 
description, SYS 5-121 
pi command (nroff) 
defined, GEN 5-72 
Picture System 2 graphics device 
See ps driver 
piles program (EQN) 
description, GEN 5-100 
Pipe 
defined, GEN 1-26, 2-11, PGM 
1-14 
description, GEN 2-11, PGM 1-14 
to 1-17 
optimal size, SYS 1-28 
programs and, GEN 2-11 
pipe system call 
description, PGM 1-15 to 1-17 
Pipeline, GEN 4-4E 
combining command input/output, 
GEN 4-32 
defined, GEN 2-11, 4-4, 4-71 
description, GEN 4-32 to 4-33 
elements in, GEN 2-11 
files read from terminal and, GEN 
2-11 
pl command (nroff/troff) 
defined, GEN 5-59 
Plain data block 
defined, SYS 2-12 
pm command (nroff/troff) 
defined, GEN 5-73 
pn command (nroff/troff) 
defined, GEN 5-59 
po command (nroff/troff) 
defined, GEN 5-59 
setting left margin, GEN 5-86E 
Point size 
changing, GEN 5-38, 5-58 
defaults, GEN 5-38 
setting, GEN 5-84 
pop directory command 
See popd command (C shell) 
popd command (C shell) 
See also pushd command (C shell) 
defined, GEN 4-71 
without argument, GEN 4-49 
Port 
defined, GEN 4-71 
Port number 
algorithm for selecting, SYS 3-26 
overriding selection algorithm, 
SYS 3-26E 
Portable C Compiler
description, PGM 2-37 to 2-61 
Posting file 
defined, GEN 5-145 
Pound sign 
See Sharp character 
pp command (me) 
See also ip command (me) 
See also lp command (me) 
defined, GEN 5-40 
description, GEN 5-22 
meaning of, GEN 2-12 
pr command (C shell) 
defined, GEN 4-71 
printing files, GEN 2-7 
printing files in three columns, 
GEN 2-11 
pre command (edit) 
recovering files, GEN 3-22 
Preface 
formatting, GEN 5-34F 
Preliminary text 
See Front matter 
preserve command (edit) 
See pre command (edit) 
preserve command (ex) 
description, GEN 3-91 
preserve command (Mail) 
See also hold command (Mail) 
abbreviating, GEN 2-22 
description, GEN 2-31 
keeping mail in your system 
mailbox, GEN 2-21 
primes program 
4.2BSD improvement, SYS 1-17 
Primitive expression 
description, GEN 2-51 
Print command 
See p command 
print command (awk) 
description, PGM 3-6 
print command (edit) 
See p command (edit) 
print command (ex) 
See p command (ex) 
print command (Mail) 
See also ignore command (Mail) 
description, GEN 2-29 
ignored fields and, GEN 2-31 
Print file 
UNIX and, PGM 2-83 
print working directory command 
See pwd command (C shell) 
printcap file
4.2BSD improvement, SYS 1-17 
creating, PGM 4-101 
printenv command (C shell) 
See also setenv command (C 
shell) 
defined, GEN 4-71 
printf function 
See also fprintf function 
output and, PGM 1-4 
printf statement (awk) 
formatting output, PGM 3-6 
printw routine 
defined, PGM 4-83 
proc.h file 
4.2BSD improvement, SYS 5-7
Process 
See also ps command (C shell) 
See also System process 
See also User process 
defined, GEN 1-26, 4-71 
maximum active, SYS 5-121 
maximum per user, SYS 5-121 
setting maximum files for, SYS 
5-121 
space for, SYS 5-121 
stopping, GEN 2-11 
synchronizing, GEN 1-27
terminating, GEN 1-27 
Process control 
data structure, PGM 4-6F 
description, PGM 4-5 to 4-6 
Process number 
defined, GEN 2-11 
determining, GEN 2-11 
Process stack 
setting growth increment, SYS 
5-121 
setting initial size, SYS 5-121 
Process time accounting 
summarizing, SYS 5-56 
PROFIL operator (C compiler) 
defined, PGM 2-65 
profil system call 
4.2BSD improvement, SYS 1-12 
profile file 
login and, GEN 4-6 
shell and, GEN 2-12 
Profiled system 
description, SYS 5-78 
PROG operator (C compiler) 
defined, PGM 2-64 
Program 
See also Command (C shell) 
defined, GEN 3-3, 4-71 
editing with vi, GEN 3-67 
executing, GEN 1-26 
executing from another, PGM 
1-12 
maintaining with make, PGM 
3-13 to 3-21 
running simultaneously, GEN 
2-11 
running two with one command 
line, GEN 2-11 
saving output, GEN 2-11 
setting maximum executing, SYS 
5-122 
stopping, GEN 2-4, 2-11 
Programmer’s manual 
See Manual 
Programming 
reading list, GEN 2-16 
tools for, GEN 2-14 to 2-15 
translating a language, GEN 2-15 
Prompt 
defined, GEN 4-71 
Prompt character 
defined, GEN 2-4 
prompt option (ex) 
description, GEN 3-99 
Protection mode 
description, PGM 1-10 
Proteon proNET ring network 
controller 
See vv network interface driver 
Protocol name 
represented by protoent structure, 
SYS 3-13, 3-14E 
protocol switch table 
See also protosw.h file 
protocols database 
4.2BSD improvement, SYS 1-17 
protosw.h file 
4.2BSD improvement, SYS 5-5 
ps command (C shell) 
See also Process 
4.2BSD improvement, SYS 1-8 
defined, GEN 4-72 
determining the process number, 
GEN 2-11 
displaying all programs running, 
GEN 2-11 
displaying unstarted background 
jobs, GEN 4-48 
ps command (troff) 
defined, GEN 5-58 
setting point size, GEN 5-84 
ps driver 
4.2BSD improvement, SYS 1-16 
ps.c device driver 
4.2BSD improvement, SYS 5-12 
PS1 variable 
defined, GEN 4-12 
PS2 variable 
defined, GEN 4-12 
Pseudo device 
specifying, SYS 5-82 
Pseudo terminal 
creating, SYS 5-48E 
description, SYS 3-24 
remote login sessions and, SYS 
3-24 
Pseudo-font 
description, GEN 5-37 
restriction, GEN 5-37 
psignal library 
4.2BSD improvement, SYS 1-15 
pstat program 
4.2BSD improvement, SYS 1-20 
ptx program 
defined, GEN 2-13 
pty driver 
4.2BSD improvement, SYS 1-16 
pu command (ex) 
description, GEN 3-91 
Publication list 
indexing, GEN 5-148 to 5-155 
updating, GEN 5-155 to 5-162 
pup_cksum.c file
4.2BSD improvement, SYS 5-13
putchar function
output and, PGM 1-4 
push directory command 
See pushd command (C shell) 
push directory command (C shell) 
See pushd command 
pushd command (C shell) 
See also cd command (C shell) 
See also popd command (C shell) 
defined, GEN 4-70 
saving name of previous directory, 
GEN 4-49 
without argument, GEN 4-49 
put command (ex) 
See pu command (ex) 
putc macro
See also fflush function 
defined, PGM 1-6 
pwd command (C shell)
See also dirs command (C shell) 
4.2BSD improvement, SYS 1-8 
defined, GEN 4-72 
print your directory name, GEN 
2-9 
working directory pathname and, 
GEN 4-48E 
PX macro 
description, GEN 5-18 


Q 


Q command 
quitting ed, GEN 2-6 
q command (DC) 
description, GEN 2-58
q command (ed) 
defined, GEN 3-34 
using, GEN 3-26 
q command (edit) 
exiting without saving edits, GEN 
3-13 
using, GEN 3-8 
q command (ex) 
See also wq command (ex) 
description, GEN 3-91 
q command (me) 
defined, GEN 5-42, 5-44 
entering, GEN 5-25 
specifying quoted text, GEN 5-38 
q command (sed) 
defined, GEN 3-114 
Q command (vi) 
defined, GEN 3-79 
q flag (make) 
defined, PGM 3-17 
q option (nroff/troff) 
defined, GEN 5-49 
qsort library 
4.2BSD improvement, SYS 1-15 
Question mark character (C shell) 
description, GEN 4-34 
Question mark character (DC) 
description, GEN 2-59 
pattern matching and, GEN 2-8 
Question mark character (ed) 
context search and, GEN 3-43 
quiet option (Mail) 
See also noheader option 
defined, GEN 2-35 
Quit command (ed) 
See q command (ed) 
quit command (edit)
See q command (edit) 
quit command (ex) 
See q command (ex) 
quit command (Mail) 
abbreviating, GEN 2-22 
description, GEN 2-31 
saving typed mail, GEN 2-22 
Quit signal 
defined, GEN 4-72 
terminating a program, GEN 4-37 
quit statement (BC) 
description, GEN 2-55 
quot program 
4.2BSD improvement, SYS 1-20 
Quota 
exceeding, GEN 3-22 
Quota file 
comparing with allocated disk 
space, SYS 2-4 
description, SYS 2-5 
Quota system 
See Disk quota system 
quota system call 
4.2BSD improvement, SYS 1-12 
quota.h file
4.2BSD improvement, SYS 5-5
quota_kern.c file
contents, SYS 5-9
quota_subr.c file
contents, SYS 5-9
quota_sys.c file
contents, SYS 5-9
quota_ufs.c file
contents, SYS 5-9 
quotacheck program 
4.2BSD improvement, SYS 1-20 
quotaon program 
See also quotaoff 
4.2BSD improvement, SYS 1-20
Quotation 
defined, GEN 4-72 
setting apart, GEN 5-25 
Quotation marks (C shell) 
using metacharacters in command 
arguments, GEN 4-35 
Quotation marks (me) 
making compatible for printers 
and typesetters, GEN 5-38 
translating for typesetter, GEN 
5-38
Quotation marks (ms) 
translating for typesetter, GEN 
5-19 
Quotation marks (nroff)
specifying font, GEN 5-36
Quotation marks (troff)
translating, GEN 5-86
Quoted string statement (BC)
forming, GEN 2-54


R 


r command (ed) 
defined, GEN 3-34 
using, GEN 3-27 
without line address, GEN 3-49 
r command (edit) 
description, GEN 3-22 
r command (ex) 
description, GEN 3-91 
r command (me) 
defined, GEN 5-44 
specifying roman font, GEN 5-36 
R command (ms) 
restoring regular font, GEN 5-8 
r command (sed), GEN 3-112E 
defined, GEN 3-112 
R command (vi) 
See also r command (vi) 
defined, GEN 3-79 
r command (vi) 
See also R command (vi) 
defined, GEN 3-81
r escape (Mail) 
description, GEN 2-24 
r flag (cp) 
file system tree and, SYS 1-5 
r flag (Mail) 
defined, GEN 2-36 
r flag (make) 
defined, PGM 3-17 
:r modifier (C shell)
extracting filename root, GEN 
4-57E 
r option (edit) 
recovering files, GEN 3-23 
r option (nroff/troff) 
defined, GEN 5-49 
r option (uucp) 
defined, SYS 5-132 
r option (uux) 
description, SYS 5-133 
RA60 disk drive 
See uda driver 
RA80 disk drive 
See uda driver 
RA81 disk drive 
See uda driver 
Rand MH system 
mail program and, SYS 1-7 
random library 
4.2BSD improvement, SYS 1-15 
Ratfor language 
See also EFL programming 
language 
See also M4 macro processor 
C and, GEN 2-15 
description, PGM 2-111 to 2-122 
Raw device 
description, SYS 5-20 
raw routine 
defined, PGM 4-85 
Raw socket 
See also Datagram socket 
defined, SYS 3-6 
rb command (me) 
defined, GEN 5-44 
RC command (me) 
defined, GEN 5-46 
rc program
4.2BSD improvement, SYS 1-20 
rcexpr routine 
arguments, PGM 2-68 
rcp program 
cp support and, SYS 1-8 
rd command (nroff/troff) 
defined, GEN 5-72 
rdump program 
See also rmt program 
4.2BSD improvement, SYS 1-18, 
1-20 
re command (me) 
defined, GEN 5-45 
Read command (ed) 
See r command (ed) 
read command (edit) 
See r command (edit) 
read command (ex) 
See r command (ex) 
read function 
description, PGM 1-9 
Read only mode (ex) 
description, GEN 3-85 
read system call 
4.2BSD improvement, SYS 1-12 
Read-ahead 
description, GEN 2-4 
readlink system call 
4.2BSD improvement, SYS 1-12 
readv system call
4.2BSD improvement, SYS 1-12 
record option (Mail) 
defined, GEN 2-35 
recover command (edit) 
description, GEN 3-22 
recover command (ex) 
description, GEN 3-92 
recv system call 
4.2BSD improvement, SYS 1-12 
previewing data, SYS 3-10 
transferring data, SYS 3-9E 
recvfrom system call 
4.2BSD improvement, SYS 1-12 
receiving data, SYS 3-10E 
recvmsg system call 
See also sendmsg system call 
4.2BSD improvement, SYS 1-12
Redirection 
defined, GEN 4-72 
redraw option (ex) 
description, GEN 3-99 
refer program 
See also Refer system
output, GEN 5-152E 
placing a reference in a paper, 
GEN 5-150 
Refer system 
See also addbib utility 
See also Indexing 
4.2BSD improvement, SYS 1-8 
description, GEN 5-133 to 5-142 
formatting bibliographic citations, 
GEN 2-13 
Reference 
formatting, GEN 5-151 
overriding numbering, GEN 5-155 
private file of, GEN 5-155 
Reference file 
defined, GEN 5-151 
refresh routine 
defined, PGM 4-83 
Register 
changing for text formatting, GEN 
5-16 
used by -ms 
reference list, GEN 5-11 
regtab table 
defined, PGM 2-68 
Regular expression (ex) 
defined, GEN 3-96 
description, GEN 3-96 to 3-97 
reference list, GEN 3-96 
rehash command (C shell)
See also path variable 
adding commands to directory 
and, GEN 4-40 
defined, GEN 4-72 
required for current path, GEN 
4-51 
Reiser, J.F., & Henry, R.R.
Berkeley VAX/UNIX Assembler 
Reference Manual, PGM 4-53 
to 4-65 
Reiser, J.F., & London, T.B. 
regenerating system software, SYS 
5-117 to 5-122 
setting up UNIX/32V V1.0, SYS 
5-107 to 5-115 
Relational operator 
description, GEN 2-53 
form, GEN 2-47 
Relative pathname 
See also Absolute pathname 
defined, GEN 4-72 
Reliably delivered message socket 
(unsupported) 
defined, SYS 3-6 
Remainder 
DC and, GEN 2-61 
remap option (ex) 
description, GEN 3-99 
remote database 
See also tip program 
4.2BSD improvement, SYS 1-17 
Remote login program, SYS 3-15F 
Remote login server program 
main loop, SYS 3-18F 
pseudo terminals and, SYS 3-24 
Remote system 
calling, SYS 5-125 
rename system call 
4.2BSD improvement, SYS 1-12
description, SYS 1-35 
renice program 
4.2BSD improvement, SYS 1-20 
reorder routine 
description, PGM 2-76 to 2-77 
repeat command (C shell) 
defined, GEN 4-72 
repeating a command, GEN 4-51 
Reply command (Mail) 
See also reply command (Mail) 
abbreviating, GEN 2-20 
answering mail, GEN 2-19 
answering the sender only, GEN 
2-20 
definition, GEN 2-29 
reply command (Mail) 
See also Reply command (Mail) 
description, GEN 2-32 
report option (ex) 
description, GEN 3-100 
repquota program 
4.2BSD improvement, SYS 1-20 
Request (nroff) 
See Command (nroff) 
Reserved word 
reference list, GEN 4-27 
reset command 
include file and, SYS 1-8 
resource.h file 
4.2BSD improvement, SYS 5-5 
restart command (lpc) 
description, PGM 4-103 
restor program 
See restore program 
restore program 
See also rrestore 
4.2BSD improvement, SYS 1-18
restore server program 
See also tar program 
RETRN operator (C compiler) 
defined, PGM 2-65 
RETURN key 
commands and, GEN 2-4 
description, GEN 3-55 
moving the cursor in vi, GEN 
3-57 
return statement (BC) 
form of, GEN 2-46 
forming, GEN 2-55 
rew command (ex) 
description, GEN 3-92 
rewind command (ex) 
See rew command (ex) 
rexecd server program 
4.2BSD improvement, SYS 1-20 
rhosts file 
description, SYS 5-49 
Ritchie, D.M. 
C Programming Language 
Reference Manual, The, PGM 
2-5 to 2-35 
I/O system, PGM 4-67 to 4-73 
standard I/O library, PGM 1-21 to 
1-24 
system security, SYS 4-3 to 4-5 
tour through C compiler, PGM 
2-63 to 2-77 
UNIX Assembler Reference 
Manual, GEN 6-53 to 6-64 
Ritchie, D.M., & Kernighan, B.W. 
M4 macro processor, PGM 2-393 
to 2-398 
programming UNIX, PGM 1-3 to 
1-24 
Ritchie, D.M., & Thompson, K. 
implementation of file system and 
user command interface, GEN 
1-19 to 1-34 
rk.c device driver 
4.2BSD improvement, SYS 5-12
RK07 disk
See va driver 
rl option (uucico) 
defined, SYS 5-135 
rl.c device driver 
4.2BSD improvement, SYS 5-12 
RL11 controller 
See rl.c device driver 
RLABEL operator (C compiler) 
defined, PGM 2-65 
rlogin server program 
login file and, SYS 1-7 
cu program and, SYS 1-8 
description, SYS 1-8 
rlogind server program 
4.2BSD improvement, SYS 1-20 
rm command (nroff/troff) 
defined, GEN 5-64 
rm command (shell) 
deleting files, GEN 2-7 
recover command (edit) and, GEN 
3-22 
removing a file, GEN 3-48E 
rmdir command 
4.2BSD improvement, SYS 1-8 
rmdir system call 
4.2BSD improvement, SYS 1-12 
rmt program 
4.2BSD improvement, SYS 1-20
rn command (nroff/troff) 
defined, GEN 5-64 
RNAME operator (C compiler) 
defined, PGM 2-65 
ro command (me) 
defined, GEN 5-44 
roffbib program 
bibliographic databases and, SYS 
1-8 
rogue game 
4.2BSD improvement, SYS 1-17 
command reference list, GEN 
6-19 to 6-21 
displaying top players, GEN 6-25 
fighting, GEN 6-21 
objects you can find, GEN 6-21 
option reference list, GEN 6-24 
playing, GEN 6-17 to 6-25 
rooms, GEN 6-21 
sample screen, GEN 6-18F 
scoring, GEN 6-24 
screen layout, GEN 6-18 to 6-19 
screen symbol reference list, GEN 
6-19 
setting options, GEN 6-23 
ROGUEOPTS variable 
using, GEN 6-23 
Roman number 
setting page number, GEN 5-44 
specifying for front matter, GEN 
5-33 
Root directory 
defined 
description, GEN 1-21 
Root file system 
block size, SYS 5-40 
dump and, SYS 5-54 
rebuilding, SYS 5-32 
restoring, SYS 5-26 
route program 
4.2BSD improvement, SYS 1-20 
description, SYS 5-51 
routed server program 
4.2BSD improvement, SYS 1-20 
description, SYS 5-51 
RP command (ms) 
specifying cover sheet, GEN 5-5 
RP06 disk 
bad block forwarding support, 
SYS 1-18 
rr command (nroff/troff) 
defined, GEN 5-66 
rrestore program 
See also rmt program 
4.2BSD improvement, SYS 1-20 
RS command (ms) 
specifying indention level, GEN 
5-7 
rs command (nroff/troff) 
defined, GEN 5-62 
RS variable (awk) 
defined, PGM 3-6 
rsh command 
See also rshd server program 
rsh server program
executing remote commands, SYS
1-8
rshd server program
4.2BSD improvement, SYS 1-20
rsp.h file
4.2BSD improvement, SYS 5-13
rt command (nroff/troff)
See also mk command
(nroff/troff); sp command
(nroff/troff)
defined, GEN 5-60
RUBOUT character
ignoring while sending mail, GEN
2-34
RUBOUT key
See DELETE key
Ruling
specifying, GEN 5-88
specifying for figure, GEN 5-45
specifying in text, GEN 5-26
with tab character, GEN 5-87E
Ruling (nroff/troff)
outside text margin, GEN 5-72
Running foot
See Page footer
Running head
See Page header
Runtime routine (C)
handling network addresses and
values, SYS 3-15T
ruptime program
See also rwhod server program
displaying status for cluster, SYS
1-8
output, SYS 3-20E
rwho program
See also rwhod server program
displaying users on clusters, SYS
1-8
rwho server program
description, SYS 3-20 to 3-22
simplified form, SYS 3-21F
rwhod server program
4.2BSD improvement, SYS 1-21
rx driver
4.2BSD improvement, SYS 1-16
rx.c device driver
4.2BSD improvement, SYS 5-12
RX02 floppy disk unit 
See rx driver 
rxl flag (me) 
setting 12 pitch, GEN 5-39 
RX211 floppy disk controller 
See rx.c device driver 
rxformat program 
4.2BSD improvement, SYS 1-21


S 


s command (DC) 
affecting register content, GEN 
2-62 
description, GEN 2-58
destructive, GEN 2-63 
programming DC, GEN 2-62 
s command (ed) 
ampersand character and, GEN 
3-34 
breaking lines, GEN 3-42 
changing all occurrences, GEN 
3-30 
changing every occurrence, GEN 
3-38E 
defined, GEN 3-34 
deleting text, GEN 3-30 
delimiters, GEN 3-30 
description, GEN 3-37 to 3-38 
g command and, GEN 3-46E 
g command restriction and, GEN 
3-47 
rearranging a line, GEN 3-43 
undoing the last substitution, 
GEN 3-38 
using, GEN 3-29 
s command (edit) 
replacing text, GEN 3-11 
uppercase letters and, GEN 3-19 
s command (ex) 
See also & command (ex) 
description, GEN 3-92 
S command (vi) 
defined, GEN 3-79 
s command (vi) 
defined, GEN 3-81 
s escape (Mail) 
description, GEN 2-25 
s flag (ln)
creating symbolic links, SYS 1-7 
s flag (Mail) 
defined, GEN 2-36 
s flag (make) 
defined, PGM 3-17 
s flag (mkey) 
ignoring labels, GEN 5-147 
s macro (me) 
defined, GEN 5-43
s option (nroff/troff) 
defined, GEN 5-49 
s option (uucico) 
defined, SYS 5-135 
s option (uucp) 
defined, SYS 5-132 
s option (uulog) 
defined, SYS 5-137 
sail game 
4.2BSD improvement, SYS 1-17 
save command (Mail) 
See also write command (Mail) 
abbreviating, GEN 2-32 
system mailbox and, GEN 2-23 
SAVE operator (C compiler) 
defined, PGM 2-65 
savehist variable 
saving history across terminal 
sessions, SYS 1-5 
savetty routine 
defined, PGM 4-88 
sc command (me) 
defined, GEN 5-47 
Scale 
defined, GEN 2-45, 2-51 
increasing value, GEN 2-45E 
limits, GEN 2-45 
printing current value, GEN 
2-45E 
rules for, GEN 2-45 
Scale factor 
defined, GEN 2-59 
Scale indicator 
attaching to numbers for troff, 
GEN 5-92 
Scale register 
description, GEN 2-60 
Scaling 
BC language and, GEN 2-45 
scanf function 
See also fscanf function 
input and, PGM 1-4 
scanw routine 
defined, PGM 4-85 
SCCS 
introduction, PGM 3-23 to 3-37 
Schmidt, E., & Lesk, M.E. 
Lex program generator, PGM 
3-118 to 3-125 
Scratch character 
creating a scratch file, GEN 4-31 
Scratch file 
creating, GEN 4-31 
defined, GEN 4-72 
Fortran and, PGM 2-83 
Screen (Screen package) 
defined, PGM 4-75 
updating, PGM 4-92E 
updating, PGM 4-76 to 4-77 
Screen (vi) 
breaking lines at right margin, 
GEN 3-67 
controlling window size, GEN 
3-65 
refreshing, GEN 3-64 
Screen editor 
invoking from Mail, GEN 2-24 
screen option (Mail) 
defined, GEN 2-35 
Screen package 
description, PGM 4-75 to 4-98 
input functions, PGM 4-78 
reference list, PGM 4-84 to 4-85 
miscellaneous functions 
reference list, PGM 4-85 to 4-88 
output functions, PGM 4-78 
reference list, PGM 4-80 to 4-84 
prerequisites, PGM 4-75 
starting, PGM 4-77 
terminal information and, PGM 
4-79
Script 
See also Script file 
script 
4.2BSD improvement, SYS 1-8
Script file, GEN 4-55K 
See also Login shell 
See also make command (C shell) 
break statement and, GEN 4-58 
commands useful to writers of, 
GEN 4-53 
comments in, GEN 4-59 
creating, GEN 2-10, 3-52E 
defined, GEN 3-51, 4-53, 4-72 
interrupts and, GEN 4-59 
invoking, GEN 4-53 
making executable, GEN 4-53 
preventing variable substitution 
by the shell, GEN 4-59 
shell input and, GEN 4-58 
Script.out file 
creating, GEN 2-11 
scroll routine 
defined, PGM 4-88 
Scrolling 
versus paging, GEN 3-56 
scrollok routine
defined, PGM 4-87 
sdb symbolic debugger 
See also dbx symbolic debugger 
accessing symbol information, 
SYS 1-5 
locating, SYS 1-8 
support, SYS 1-6 
search command (edit) 
See Context search (edit) 
Search path 
See PATH variable 
Section 
editing with vi, GEN 3-61 
indenting, GEN 5-32E 
vi definition, GEN 3-62 
Section head 
coordinating numbers with 
chapter numbers, GEN 5-41 
entering in text file, GEN 5-6 
indenting, GEN 5-7E 
numbering automatically, GEN 
5-31 to 5-32, 5-40 to 5-41 
numbering automatically with a 
macro, GEN 5-75E 
specifying beginning number, 
GEN 5-32E 
specifying unnumbered, GEN 
5-32E 
text formatting commands for, 
GEN 5-14E 
sections option (ex) 
description, GEN 3-100 
Security 
dial-up network and, SYS 5-125 
UNIX and, SYS 4-3 to 4-5 
uucp system and, SYS 5-138 
sed stream editor 
address types, GEN 3-107 to 
3-108 
command line format, GEN 
3-105E 
defined, GEN 2-13, 3-52 
description, GEN 3-105 to 3-114 
ed and, GEN 3-105 
functions, GEN 3-108 to 3-114 
operation, GEN 3-105 to 3-106 
taking commands from a file, 
GEN 3-52E 
uses, GEN 3-105 
seek function 
See also lseek 
description, PGM 1-12 
select system call 
4.2BSD improvement, SYS 1-12 
multiplexing I/O requests, SYS 
3-11E 
Semicolon character (ed) 
compared with comma, GEN 3-45 
setting dot, GEN 3-45 to 3-46 
send system call 
4.2BSD improvement, SYS 1-12 
transferring data, SYS 3-9E 
sendbug program 
See also bugfiler program 
submitting 4.2BSD bug reports, 
SYS 1-8 
sendmail 
installation and operation guide, 
SYS 2-27 to 2-60 
Sendmail Installation and Operation 
Guide, SYS 2-27 to 2-60 
See also sendmail 
sendmail option (Mail) 
defined, GEN 2-35 
sendmail program 
See also mailaddr 
See also sendmail option 
See also syslog server program 
4.2BSD improvement, SYS 1-4, 
1-21 
implementing aliases, GEN 2-21 
sendmsg system call 
See also recvmsg system call 
4.2BSD improvement, SYS 1-12 
sendto primitive 
sending data, SYS 3-10E 
sendto system call 
4.2BSD improvement, SYS 1-12 
Sentence 
editing with vi, GEN 3-61 
vi definition, GEN 3-61 
Sequenced packet socket 
(unsupported) 
defined, SYS 3-6 
Server process 
See also Client process 
description, SYS 3-17 
Service name 
represented by the servent 
structure, SYS 3-14 
Service process 
See also Service server 
Service server 
See also Xerox Courier protocol 
description, SYS 3-17
services database 
4.2BSD improvement, SYS 1-17 
set command (C shell) 
C shell variables and, GEN 4-40E 
defined, GEN 4-72 
set command (ex) 
description, GEN 3-92 
set command (Mail) 
See also unset command (Mail) 
forms of, GEN 2-20 
options and, GEN 2-32 
restriction, GEN 2-21 
Set terminal options command 
See stty command (C shell) 
Set-GID bit 
description, SYS 4-4 
security and, SYS 4-5 
Set-UID bit 
description, SYS 4-4 
security and, SYS 4-5 
setbuf library routine 
See also setbuffer library routine 
setbuffer library routine 
See also setbuf library routine 
4.2BSD improvement, SYS 1-14 
setenv command (C shell) 
See also printenv command (C 
shell) 
defined, GEN 4-73 
setting variables in environment, 
GEN 4-51E 
setgid system call 
See setregid system call 
Sethi-Ullman algorithm 
C compiler and, PGM 2-69 to 
2-70 
setifaddr program 
4.2BSD improvement, SYS 1-21 
setlinebuf library routine 
4.2BSD improvement, SYS 1-14 
setquota system call 
4.2BSD improvement, SYS 1-12 
SETREG operator (C compiler) 
defined, PGM 2-65 
setregid system call 
4.2BSD improvement, SYS 1-12 
setreuid system call 
4.2BSD improvement, SYS 1-12 
setterm routine 
defined, PGM 4-88 
setuid system call 
See setreuid system call 
SFCON operator (C compiler) 
defined, PGM 2-66 
SG command (ms) 
specifying signature line, GEN 5-9 
sh command (ex) 
description, GEN 3-92 
sh command (me) 
See also uh command (me) 
defined, GEN 5-40 
numbering section heads, GEN 
5-31 to 5-32 
SH command (ms) 
specifying unnumbered section 
head, GEN 5-6 
sh program 
See Bourne shell 
Shared lock 
multiple processes and, SYS 1-3 
Sharp character 
printing, GEN 3-39 
Sharp character (#) 
entering in text, GEN 2-4 
erasing last character typed, GEN 
2-4 
shell comments and, GEN 4-57 
Shell 
See also C shell 
See Bourne shell 
defined, GEN 4-73 
description, GEN 1-27 to 1-31 
implementing, GEN 1-29 
shell command (ex) 
See sh command (ex) 
shell command (Mail) 
See also SHELL option 
description, GEN 2-32 
executing Shell command from 
Mail, GEN 2-22 
shell option (ex) 
description, GEN 3-100 
SHELL option (Mail) 
defined, GEN 2-33 
setting, GEN 2-32 
specifying, GEN 2-20 
Shell procedure 
debugging, GEN 4-15 
defined, GEN 4-7 
description, GEN 4-7 to 4-16 
Shell program 
definition, GEN 2-11 
description, GEN 2-11 to 2-12 
escaping to from Mail, GEN 2-25 
profile file and, GEN 2-12 
programming aids, GEN 2-14 
as programming language, GEN 
2-14 




Shell program (Cont.) 
reading a file for commands, GEN 
2-12 
specifying for Mail, GEN 2-20 
Shell script 
See Script file 
shiftwidth option (ex) 
description, GEN 3-100 
Shoens, K., & Leres, C. 
Mail Reference Manual, GEN 
2-17 to 2-41 
showmatch option (ex) 
description, GEN 3-100 
showmatch option (vi) 
lisp and, GEN 3-68 
shutdown system call 
4.2BSD improvement, SYS 1-12 
data pending and, SYS 3-10E 
sigblock system call 
4.2BSD improvement, SYS 1-12 
SIGCHLD signal 
constructing server processes, SYS 
3-27 
reaping child processes, SYS 
3-28E 
SIGIO signal 
4.2BSD improvement, SYS 1-13, 
5-7 
interrupt-driven I/O and, SYS 3-27 
Signal 
defined, GEN 4-73 
description, PGM 1-17 to 1-20 
handling methods, GEN 4-22 
Signal facilities 
4.2BSD improvement, SYS 1-3 
signal function 
description, PGM 1-17 to 1-20 
signal.h file 
4.2BSD improvement, SYS 5-7 
signals and, PGM 1-17 
Signature line 
specifying, GEN 5-9 
sigpause system call 
4.2BSD improvement, SYS 1-12 
SIGPROF signal 
4.2BSD improvement, SYS 1-18, 
5-7 
sigsetmask system call 
4.2BSD improvement, SYS 1-12 
sigstack system call 
4.2BSD improvement, SYS 1-12 
sigsys system call 
See Signal facilities 


SIGTINT signal 
See SIGIO signal 
SIGURG signal 
4.2BSD improvement, SYS 1-13, 
5-7 
out of band data and, SYS 3-27 
sigvec system call 
4.2BSD improvement, SYS 1-13 
SIGVTALRM signal 
4.2BSD improvement, SYS 1-18, 
5-7 
sinclude command (M4) 
description, PGM 2-396 
SINCR parameter 
description, SYS 5-121 
Singlespacing 
specifying, GEN 5-23 
size keyword (EQN) 
changing point size, GEN 5-100 
sk command (me) 
defined, GEN 5-44 
Sklower, K.L., & others 
Franz Lisp Manual, The, PGM 
2-211 to 2-358 
Slash 
See Backslash 
Slow terminal 
editing on, GEN 3-64 
vi and, GEN 3-74 
slowopen option (ex) 
description, GEN 3-100 
SM command (ms) 
decreasing type size, GEN 5-8 
SMAPSIZ parameter 
description, SYS 5-122 
SMTP 
See DARPA Simple Mail Transfer 
Protocol 
SNAME operator (C compiler) 
defined, PGM 2-65 
so command (ex) 
See so command (ex) 
description, GEN 3-92 
so command (nroff/troff) 
defined, GEN 5-72 
interpolating file name, GEN 5-81 
SO_DEBUG option 
network and, SYS 5-57 
Socket 
binding, SYS 3-7 
creating, SYS 3-7 
description, SYS 3-6 to 3-11 
discarding, SYS 3-10, 3-10E 
naming, SYS 3-6 


Socket (Cont.) 
optimal size, SYS 1-28 
process group and, SYS 3-23 
types of, SYS 3-6 
Socket name 
binding to UNIX domain socket, 
SYS 3-8E 
description, SYS 3-7 
Socket system call 
creating a socket, SYS 3-7E 
socket system call 
4.2BSD improvement, SYS 1-13 
failure, SYS 3-7 
socket.h file 
4.2BSD improvement, SYS 5-5 
socketpair system call 
4.2BSD improvement, SYS 1-13 
socketvar.h file 
4.2BSD improvement, SYS 5-5 
Soft limit 
defined, SYS 2-3 
Software maintenance 
using network for, SYS 5-127 
SOH 
See Leader character (nroff/troff) 
sort program 
defined, GEN 2-13, 4-73 
specifying numeric sort, GEN 
4-32E 
sortbib command 
sorting bibliographic databases 
and, SYS 1-9 
Source Code Control System 
See SCCS 
source command 
description, GEN 2-32 
source command (C shell) 
defined, GEN 4-73 
effecting changes to .cshrc 
immediately, GEN 4-51 
Source file 
locating 
reference list, SYS 5-117 
Source management system 
defined, PGM 3-23 
sp command (me) 
See also bl command (me) 
entering, GEN 5-23 
sp command (nroff/troff) 
defined, GEN 5-62 
setting, GEN 5-84 
Space character 
edit and, GEN 3-7 
Special character 
See Metacharacters 
searching, GEN 3-21 
Spell 
defined, GEN 2-13 
detecting spelling errors, GEN 
2-13 
sprintf function 
See also fprintf function 
description, PGM 1-8 
sprintf function (awk) 
defined, PGM 3-8 
sptab table 
defined, PGM 2-68 
SQFILE 
description, SYS 5-142 
sqrt function (awk) 
defined, PGM 3-8 
sqrt keyword, GEN 2-44E 
defined, GEN 2-51 
sqrt operator (EQN) 
creating square roots, GEN 5-100 
Square root 
creating with EQN, GEN 5-100 
DC and, GEN 2-61 
Square root (BC), GEN 2-44 
ss command (troff) 
defined, GEN 5-58 
sscanf function 
description, PGM 1-8 
SSIZE parameter 
description, SYS 5-121 
SSPACE operator (C compiler) 
defined, PGM 2-64 
Stack command (DC) 
description, GEN 2-62 
Standalone I/O library 
4.2BSD improvement, SYS 5-15 
Standard error output file 
description, PGM 1-6 
Standard I/O library 
call formats, PGM 1-21 to 1-24 
defined, PGM 1-5 
description, PGM 1-5 to 1-8, 1-21 
to 1-24 
Standard input 
See Input 
typing form letters or text with 
nroff/troff, GEN 5-72 
Standard input file 
description, PGM 1-6 
Standard output 
See Output 
Standard output file 
description, PGM 1-6 
standout routine 
defined, PGM 4-84 
Star 
See Asterisk character 
start command (lpc) 
description, PGM 4-103 
Startup file 
running, GEN 2-12 
stat system call 
4.2BSD improvement, SYS 1-13 
stat.h file 
4.2BSD improvement, SYS 5-7 
Statement (as) 
description, GEN 6-55 to 6-56 
Statement (BC) 
See also specific statements 
description, GEN 2-54 to 2-55 
typing several on one line, GEN 
2-48 
Status 
defined, GEN 4-73 
status command (mt) 
showing state of tape drive, SYS 
1-7 
stderr file pointer 
description, PGM 1-6 
error handling and, PGM 1-7 
stdin file pointer 
description, PGM 1-6 
stdio library 
4.2BSD improvement, SYS 1-14 
stdout file pointer 
description, PGM 1-6 
stop command (C shell) 
background jobs and, GEN 4-46E 
defined, GEN 4-73 
stop command (ex) 
Berkeley TTY driver and, GEN 
3-102 
description, GEN 3-93 
stop command (lpc) 
description, PGM 4-103 
Stopped message 
suspending jobs and, GEN 4-46 
Storage class 
description, GEN 2-53 
store command (DC) 
See s command (DC) 
Stream socket 
See also Datagram socket 
creating in Internet domain, SYS 
3-7E 


Stream socket (Cont.) 
defined, SYS 3-6 
String (C shell) 
defined, GEN 4-73 
String (nroff/troff) 
defined, GEN 5-62 
description, GEN 5-62 to 5-65 
String statement (as) 
defined, GEN 6-56 
strip 
4.2BSD improvement, SYS 1-9 
STST file 
description, SYS 5-143 
stterm routine 
variables set by, PGM 4-89T to 
4-90T 
stty command 
DEC standard values and, SYS 
1-9 
stty command (C shell) 
background jobs and, GEN 4-48 
defined, GEN 4-73 
Style program 
See also Diction program 
description, GEN 5-163 to 5-177 
su 
4.2BSD improvement and, SYS 
1-9 
sub keyword (EQN) 
specifying subscripts, GEN 5-99 
subr_mcount.c file 
contents, SYS 5-9 
subr_prf.c file 
contents, SYS 5-9 
subr_rmap.c file 
contents, SYS 5-9 
subr_xxx.c file 
contents, SYS 5-9 
Subscript 
specifying, GEN 5-47 
Subscript (EQN) 
specifying, GEN 5-99 
Subscript (nroff/troff) 
specifying, GEN 5-68 
Subscript (troff) 
specifying, GEN 5-87E 
Subscripted variable 
defined, GEN 2-46 to 2-47 
Substitute command 
See s command 
substitute command (edit) 
See s command (edit) 
substitute command (ex) 
See s command (ex) 


substitute command (sed), GEN 
3-111E 
description, GEN 3-110 to 3-111 
special characters and, GEN 
3-110 
Substitution 
See also Expansion 
defined, GEN 4-73 
substr command (M4) 
description, PGM 2-397 
substr function (awk) 
defined, PGM 3-8 
Subtraction 
DC and, GEN 2-60 
subwin routine 
defined, PGM 4-87 
Suffix list (make), PGM 3-17 
description, PGM 3-21 
Summary information 
contents, SYS 2-8 
sup keyword (EQN) 
specifying superscripts, GEN 5-99 
Super user 
security and, SYS 4-4 
Super-block 
description, SYS 2-8 
Superscript 
specifying, GEN 5-47 
Superscript (EQN) 
specifying, GEN 5-99 
Superscript (nroff/troff) 
specifying, GEN 5-68 
Superscript (troff) 
specifying, GEN 5-87E 
Suspended job 
defined, GEN 4-73 
description, GEN 4-36 
sv command (me) 
specifying blank lines, GEN 5-44 
sv command (nroff/troff) 
defined, GEN 5-62 
Swap space configuration 
4.2BSD improvement, SYS 1-4 
swapgeneric.c file 
4.2BSD improvement, SYS 5-14 
swapon system call 
4.2BSD improvement, SYS 1-13 
SWIT operator (C compiler) 
defined, PGM 2-65 
switch command (C shell) 
defined, GEN 4-73 
exiting from, GEN 4-58 
forms of, GEN 4-58 
sx command (me) 
defined, GEN 5-41 
Symbolic link 
description, SYS 1-8, 1-34 
Symbolic link data block 
defined, SYS 2-12 
SYMDEF operator (C compiler) 
defined, PGM 2-64 
symlink system call 
4.2BSD improvement, SYS 1-13 
Symmetric protocol 
defined, SYS 3-17 
sys directory 
file prefixes, SYS 5-8T 
sys_errno 
printing, PGM 1-12 
sys_generic.c file 
contents, SYS 5-9 
sys_inode.c file 
contents, SYS 5-9 
sys_machdep.c file 
4.2BSD improvement, SYS 5-13 
sys_process.c file 
contents, SYS 5-9 
sys_socket.c file 
contents, SYS 5-9 
syscmd command (M4) 
description, PGM 2-396 
sysline program 
maintaining terminal status, SYS 
1-9 
syslog server program 
4.2BSD improvement, SYS 1-21 
System function 
description, PGM 1-12 
System identifier 
defined, SYS 5-74 
System mailbox file 
commands for folders and, GEN 
2-23 
hold option and, GEN 2-32 
incoming mail and, GEN 2-17 
mbox and, GEN 2-20 
storing mail, GEN 2-20, 2-21 
System management 
best reference, SYS 
System process 
defined, PGM 4-5 
System time 
4.2BSD improvement, SYS 1-4 
System-wide file 
defined, GEN 2-21 
Systems Industries 9700 tape drive 
See ut.c device driver 
systm.h file 
See also kernel.h file 
4.2BSD improvement, SYS 5-7 
sz command (me) 
changing point size, GEN 5-38W 
defined, GEN 5-44 


T 


t command (ed) 
compared with m command, GEN 
3-51 
creating a series of variable lines, 
GEN 3-51 
t command (ex) 
See copy command (ex) 
t command (sed) 
defined, GEN 3-114 
T command (vi) 
defined, GEN 3-79 
t command (vi) 
defined, GEN 3-81 
t escape (Mail) 
description, GEN 2-25 
T flag (Mail) 
defined, GEN 2-36 
t flag (make) 
defined, PGM 3-17 
T option (hunt) 
defined, GEN 5-149 
t option (hunt) 
defined, GEN 5-149 
T option (nroff) 
defined, GEN 5-50 
t option (troff) 
defined, GEN 5-50 
ta command (nroff/troff) 
defined, GEN 5-66 
Tab 
resetting, GEN 5-45 
setting multiple, GEN 5-87 
Tab character 
printing, GEN 3-37 
terminals without, GEN 2-4 
Tab character (nroff/troff) 
setting, GEN 5-66 
uninterpreted, GEN 5-66 
Tab replacement character 
See tc command (troff), GEN 
5-87 
Tab stop 
setting, GEN 3-61n 
vi and, GEN 3-61 


Table 
breaking across pages, GEN 5-10 
continuing, GEN 5-35 
entering with -ms, GEN 5-8 
floating, GEN 5-45 
formatting, GEN 2-13, 5-33 
keeping on one page, GEN 5-42 
text formatting commands for, 
GEN 5-16E 
Table of contents 
entering, GEN 5-28 
formatting, GEN 5-34F 
producing, GEN 5-18, 5-18E 
specifying multiple, GEN 5-29 
specifying section titles for, GEN 
5-41 
specifying without leadering, GEN 
5-29 
Tables 
formatting, GEN 5-115 to 5-131 
tabstop option (ex) 
description, GEN 3-100 
Tag 
defined, GEN 5-145 
tag command (ex) 
description, GEN 3-93 
Tag file 
defined, GEN 5-145 
taglength option (ex) 
description, GEN 3-100 
tags option (ex) 
3.5 changes, GEN 3-103 
description, GEN 3-100 
tail 
4.2BSD improvement, SYS 1-9 
talk program 
description, SYS 1-9 
tar program 
4.2BSD improvement, SYS 1-9, 
1-17 
tbl program 
description, GEN 5-38, 5-115 to 
5-131 
formatting tables, GEN 2-13 
tc command (nroff/troff) 
defined, GEN 5-66 
tc command (troff) 
replacing tab character, GEN 5-87 
TCP program 
See trpt program 
teachgammon program 
4.2BSD improvement, SYS 1-17 


Technical memorandum 
text formatting commands for, 
GEN 5-13E 
Tektronix 4025 terminal 
command character for, GEN 3-76 
Tektronix 4027 terminal 
command character for, GEN 3-76 
telnet program 
ARPA Telnet protocol and, SYS 
1-9 
telnetd server program 
login file and, SYS 1-7 
4.2BSD improvement, SYS 1-21 
term option (ex) 
description, GEN 3-101 
Terminal 
See also Hardcopy terminal 
See also Pseudo terminal 
See also Screen (Screen package) 
See also Screen package 
See also Slow terminal 
See also Uppercase terminal 
configuring, SYS 5-42 
programs changing mode of, GEN 
4-48 
replacing with a file, GEN 2-10 
specifying output type with nroff, 
GEN 5-50 
specifying standard output with 
troff, GEN 5-50 
specifying type, GEN 3-54E 
strange behavior, GEN 2-4 
supported 
reference list, GEN 2-3 
switch settings, GEN 2-3 
type codes, GEN 3-53T 
without tabs, GEN 2-4 
Terminal screen 
defined, PGM 4-75 
Termination 
defined, GEN 4-73 
terse option (ex) 
description, GEN 3-101 
test command 
Bourne shell and, GEN 4-12 
Text editor 
See ed editor 
defined, GEN 3-3, 3-25 
See also Edit editor, GEN 3-3 
Text Formatting 
See also nroff/troff text processor 
Text input mode (ex) 
defined, GEN 3-85 
Text segment (as) 
description, GEN 6-54 
text statement 
defined, GEN 6-59 
tftpd server program 
4.2BSD improvement, SYS 1-21 
TH command (me) 
continuing a table, GEN 5-35E 
th command (me) 
defined, GEN 5-45 
formatting a thesis, GEN 5-33 
then command (C shell) 
See also else command (C shell) 
See also if/endif commands (C 
shell) 
defined, GEN 4-73 
Thesis 
formatting, GEN 5-18, 5-33, 5-45 
text formatting commands for, 
GEN 5-13E 
Thompson, K. 
UNIX implementation, PGM 4-5 
to 4-14 
Thompson, K., & Morris, R. 
password system, SYS 4-7 to 4-12 
Thompson, K., & Ritchie, D.M. 
implementation of file system and 
user command interface, GEN 
1-19 to 1-34 
ti command (me) 
entering, GEN 5-24 
ti command (nroff/troff) 
defined, GEN 5-62 
ems and, GEN 5-86 
Tilde character (C shell) 
accessing files from other 
directories, GEN 4-34 
Tilde character (me) 
See Metacharacters 
Tilde escape (Mail) 
defined, GEN 2-24 
description, GEN 2-24 to 2-26 
lines beginning with, GEN 2-26 
printing summary of, GEN 2-26 
reference list, GEN 2-40T 
time command (C shell) 
defined, GEN 4-74 
timing a command, GEN 4-52E 
time.h file 
4.2BSD improvement, SYS 5-7 
timeout option (ex) 
description, GEN 3-102 
TIMEZONE parameter 
description, SYS 5-122 
timezone parameter (config) 
defined, SYS 5-79 
tip program 
cu program as front end, SYS 1-5 
description, SYS 1-4, 1-9 
Title page 
formatting informal, GEN 5-46 
specifying, GEN 5-82, 5-45 
TL command (ms) 
AE command and, GEN 5-6 
tl command (nroff/troff) 
defined, GEN 5-70 
tl command (troff) 
printing page numbers, GEN 
5-91E 
tm command (nroff/troff) 
defined, GEN 5-73 
TM file 
description, SYS 5-142 
TM macro 
description, GEN 5-18 
tm.c device driver 
4.2BSD improvement, SYS 5-12 
to keyword (EQN), GEN 5-100E 
Token 
defined, GEN 2-50 
top command (Mail) 
See also toplines option 
abbreviating, GEN 2-32 
description, GEN 2-32 
toplines option (Mail) 
defined, GEN 2-35 
setting, GEN 2-32E 
topq command (lpc) 
description, PGM 4-103 
touchwin routine 
defined, PGM 4-87 
Toy, M.C., & Arnold, K.C.R.C. 
guide to the dungeons of doom, 
GEN 6-17 to 6-25 
tp command (me) 
defined, GEN 5-45 
specifying a title page, GEN 5-32 
specifying title page, GEN 5-33E 
tr command (nroff/troff) 
defined, GEN 2-13, 5-67 
using, GEN 2-13E 
transfer command 
See t command (ed) 
translit command (M4) 
description, PGM 2-397 
Transparent throughput (nroff/troff) 
specifying, GEN 5-67 


Trap 
description, GEN 1-31 
trap command (Bourne shell) 
fault handling, GEN 4-21 to 4-23 
trap.c file 
4.2BSD improvement, SYS 5-14 
trek game 
4.2BSD improvement, SYS 1-17 
troff text processor 
See also EQN program 
See also ms macro package 
See also nroff text processor 
See also nroff/troff text processor 
See also tbl program 
defined, GEN 2-12, 5-83 
defining macros, GEN 5-89 to 
5-90 
defining strings, GEN 5-88, 5-89 
device resolution and, GEN 5-56 
drawing horizontal and vertical 
lines of characters, GEN 5-88 
entering arithmetic expressions, 
GEN 5-92 
entering commands, GEN 5-83 
environments, GEN 5-94 
formatting a document with -ms, 
GEN 2-12 
indenting lines, GEN 5-86 
invoking, GEN 5-49 
moving characters up and down, 
GEN 5-87 
moving text backwards on a line, 
GEN 5-87 
setting point sizes, GEN 5-84 
setting tabs, GEN 5-86 
setting vertical spacing, GEN 5-84 
specifying cut mark, GEN 5-74 
specifying fonts, GEN 5-85 
specifying fonts on the typesetter, 
GEN 5-86 
specifying metacharacters, GEN 
5-86 
specifying page heading, GEN 
5-90 
specifying unpaddable characters, 
GEN 5-88 
stopping phototypesetter to reload, 
GEN 5-49 
tutorial, GEN 5-83 to 5-96 
trpt program 
4.2BSD improvement, SYS 1-21 
truncate system call 
4.2BSD improvement, SYS 1-13 


TS command (me) 
continuing tables, GEN 5-35 
defined, GEN 5-45 
formatting tables, GEN 5-35 
ts driver 
4.2BSD improvement, SYS 1-16 
ts.c device driver 
4.2BSD improvement, SYS 5-13 
tset command (C shell) 
defined, GEN 4-74 
using, GEN 4-30E 
tstp routine 
defined, PGM 4-88 
tty 
See also ttydev.h file 
handling, SYS 5-6 
tty character 
See also ttychars.h file 
handling, SYS 5-5 
tty command (C shell) 
defined, GEN 4-74 
tty.c file 
4.2BSD improvement, SYS 5-9 
tty.h file 
4.2BSD improvement, SYS 5-7 
tty_bk.c file 
obsolete, SYS 5-9 
tty_conf.c file 
contents, SYS 5-9 
tty_pty.c file 
4.2BSD improvement, SYS 5-9 
tty_subr.c file 
contents, SYS 5-9 
tty_tb.c file 
contents, SYS 5-9 
tty_tty.c file 
contents, SYS 5-9 
ttychars.h file 
4.2BSD improvement, SYS 5-5 
ttydev.h file 
4.2BSD improvement, SYS 5-6 
tu driver 
4.2BSD improvement, SYS 1-16 
tu.c file 
4.2BSD improvement, SYS 5-14 
TU58 cartridge tape cassette 
See uu driver 
See uu.c device driver 
TU80 tape drive 
See ts driver 
tunefs program 
4.2BSD improvement, SYS 1-21 
Tuthill, B. 
-ms revised version, GEN 5-17 to 
5-19 
using refer, GEN 5-138 to 5-142 
Twinkle program 
description, PGM 4-92E 
motion optimization and, PGM 
4-97E 
Two-column output 
See Column 
type command (Mail) 
See print command (Mail) 
abbreviating, GEN 2-18 
description, GEN 2-32 
reading mail and, GEN 2-18 to 
2-19 
Type-number (refer) 
reference list, GEN 5-152 
Typesetting Mathematics - User’s 
Guide, GEN 5-105 to 5-114 
Typing 
correcting mistakes, GEN 2-4 
Typo 
defined, GEN 2-13 
detecting spelling errors, GEN 
2-13 


U 


u command (ed) 
using, GEN 3-38 
u command (edit) 
See also At sign 
See also CTRL-H 
description, GEN 3-16 
recovering files, GEN 3-23 
u command (ex) 
description, GEN 3-93 
u command (me) 
defined, GEN 5-44 
u command (troff) 
specifying superscripts and 
subscripts, GEN 5-87 
U command (vi) 
defined, GEN 3-79 
u command (vi) 
defined, GEN 3-81 
u flag (Mail) 
defined, GEN 2-36 
u option (uulog) 
defined, SYS 5-137 
uba.c device driver 
4.2BSD improvement, SYS 5-13 
uba_ctlr structure 
description, SYS 5-93 
uba_device structure 
description, SYS 5-94 
uba_driver structure 
description, SYS 5-90 
ud_addr routine 
description, SYS 5-93 
ud_attach routine 
description, SYS 5-92 
ud_dgo routine 
description, SYS 5-93 
ud_dinfo routine 
description, SYS 5-93 
ud_dname routine 
description, SYS 5-93 
ud_minfo routine 
description, SYS 5-93 
ud_mname routine 
description, SYS 5-93 
ud_probe routine 
description, SYS 5-91 
ud_slave routine 
description, SYS 5-91 
ud_xclu routine 
description, SYS 5-93 
uda driver 
4.2BSD improvement, SYS 1-16 
uda.c device driver 
4.2BSD improvement, SYS 5-13 
uf command (nroff/troff) 
defined, GEN 5-67 
ufs_alloc.c file 
contents, SYS 5-9 
ufs_bio.c file 
contents, SYS 5-10 
ufs_bmap.c file 
contents, SYS 5-10 
ufs_dsort.c file 
contents, SYS 5-10 
ufs_fio.c file 
contents, SYS 5-10 
ufs_inode.c file 
contents, SYS 5-10 
ufs_machdep.c file 
4.2BSD improvement, SYS 5-13 
ufs_mount.c file 
contents, SYS 5-10 
ufs_nami.c file 
contents, SYS 5-10 
ufs_subr.c file 
contents, SYS 5-10 
ufs_syscalls.c file 
contents, SYS 5-10 


ufs_tables.c file 
contents, SYS 5-10 
ufs_xxx.c file 
contents, SYS 5-10 
uh command (me) 
defined, GEN 5-41 
specifying unnumbered section 
heads, GEN 5-32E 
ui_addr routine 
description, SYS 5-95 
ui_alive routine 
description, SYS 5-95 
ui_ctlr routine 
description, SYS 5-94 
ui_dk routine 
description, SYS 5-95 
ui_driver routine 
description, SYS 5-94 
ui_flags routine 
description, SYS 5-95 
ui_hd routine 
description, SYS 5-95 
ui_intr routine 
description, SYS 5-95 
ui_mi routine 
description, SYS 5-95 
ui_physaddr routine 
description, SYS 5-95 
ui_slave routine 
description, SYS 5-94 
ui_type routine 
description, SYS 5-95 
ui_ubanum routine 
description, SYS 5-94 
ui_unit routine 
description, SYS 5-94 
UID 
description, GEN 1-22, SYS 4-4 
uio.h file 
4.2BSD improvement, SYS 5-6 
uipc_domain.c file 
contents, SYS 5-10 
uipc_mbuf.c file 
contents, SYS 5-10 
uipc_pipe.c file 
contents, SYS 5-10 
uipc_proto.c file 
contents, SYS 5-10 
uipc_socket.c file 
contents, SYS 5-10 
uipc_socket2.c file 
contents, SYS 5-10 
uipc_syscalls.c file 
contents, SYS 5-10 


uipc_usrreq.c file 
contents, SYS 5-10 
ul command 
4.2BSD improvement, SYS 1-9 
ul command (me) 
See also u command (me) 
entering, GEN 5-25 
troff and, GEN 5-36 
UL command (ms) 
underlining a word, GEN 5-8 
ul command (nroff/troff) 
defined, GEN 5-67 
ul command (troff) 
specifying italic lines, GEN 5-86 
ULTRIX-32 
See also UNIX 
ULTRIX-32 Operating System 
getting started, GEN 2-1 to 2-64 
um_cmd routine 
description, SYS 5-94 
um_ctlr routine 
description, SYS 5-94 
um_driver routine 
description, SYS 5-94 
um_hd routine 
description, SYS 5-94 
um_intr routine 
description, SYS 5-94 
um_tab routine 
description, SYS 5-94 
um_ubinfo routine 
description, SYS 5-94 
Umlaut 
See Metacharacters 
un network interface driver 
4.2BSD improvement, SYS 1-16 
un.h file 
4.2BSD improvement, SYS 5-6 
una command (ex) 
See also ab command (ex) 
description, GEN 3-93 
unabbreviate command (ex) 
See una command (ex) 
unalias command (C shell) 
See also alias command (C shell) 
defined, GEN 4-74 
Unary operator 
defined, GEN 2-52 
Unary operator (C compiler) 
description, PGM 2-66 
unctrl routine 
defined, PGM 4-87 
undelete command (Mail) 
See also delete command (Mail) 
undelete command (Mail) (Cont.) 
abbreviating, GEN 2-33 
description, GEN 2-33 
Underlining 
See also Italic 
nroff and, GEN 5-66 
on the typesetter, GEN 5-8 
specifying, GEN 5-8, 5-25 
technique for, GEN 3-42 
Undo command 
See u command 
undo command (edit) 
See u command (edit) 
undo command (ex) 
See u command (ex) 
Ungermann-Bass network interface 
unit 
See un network interface driver 
ungetc function 
description, PGM 1-8 
UNIBUS 
device naming, SYS 5-20 
UNIBUS device driver 
support routines, SYS 5-95 
univec.c file 
installing device driver and, SYS 
5-119 
UNIX Assembler Reference Manual, 
GEN 6-53 to 6-64 
See also as assembler 
UNIX Operating System 
See also 4.2BSD 
See also ULTRIX-32 
See also VAX UNIX system 
bootstrapping and 4.2BSD, SYS 
5-15 
building process, SYS 5-76 to 
5-78 
building with config, SYS 5-73 to 
5-105 
changes in 4.2BSD, SYS 1-3 to 
1-21 
computer-aided instruction for, 
GEN 6-3 to 6-16 
crashing, SYS 4-3 
defined, GEN 3-3 
design considerations, GEN 1-31 
device naming, SYS 5-19 
distinguishing block and raw 
devices, SYS 5-20 
for beginners, GEN 2-3 to 2-16 
getting started, GEN 6-15 to 6-16 
hardware environment, GEN 1-20 
implementation, PGM 4-5 to 4-14 
UNIX Operating System (Cont.) 
introduction, GEN 1-19 to 1-20 
managing 
See SYS 
other operating systems and, 
PGM 4-13 
programming, PGM 1-3 to 1-24 
reading list, GEN 2-15 
software environment, GEN 1-20 
UNIX Programmer’s Manual 
accessing on line, GEN 2-5 
UNIX/32V Operating System 
hardware requirements, GEN 1-4 
highlights, GEN 1-3 to 1-18 
recreating, SYS 5-119 
regenerating system software, SYS 
5-117 to 5-122 
setting up V1.0, SYS 5-107 to 
5-115 
tuning, SYS 5-121 to 5-122 
UNIX/32V Programmer’s Manual 
online, GEN 1-11 
unlink function 
description, PGM 1-11 
unlink system call 
See mkdir command 
unmap command (ex) 
See also map command (ex) 
description, GEN 3-93 
unoptim routine (C shell) 
See also optim routine (C shell) 
description, PGM 2-67 to 2-68 
Unpaddable space character 
(nroff/troff) 
defined, GEN 5-60, 5-88 
specifying for digits, GEN 5-88 
specifying for spaces, GEN 5-88 
unpcb.h file 
4.2BSD improvement, SYS 5-6 
unset command (C shell) 
defined, GEN 4-74 
unset command (Mail) 
See also set command (Mail) 
description, GEN 2-33 
until statement (C shell) 
See also while statement (C shell) 
description, GEN 4-13 
up driver 
4.2BSD improvement, SYS 1-16 
up.c device driver 
4.2BSD improvement, SYS 5-13 
Uppercase terminal 
vi and 


User ID 
See UID 
User Identification Number 
See UID 
User identification number 
See UID 
User process 
defined, PGM 4-5 
user.h file 
4.2BSD improvement, SYS 5-7 
USERFILE 
defined, SYS 5-140 
USR directory 
block size, SYS 5-40 
description, GEN 2-9 
rebuilding, SYS 5-32 
setting up, SYS 5-28 
ut.c device driver 
4.2BSD improvement, SYS 5-12 
utime system call 
See utimes system call 
utimes system call 
4.2BSD improvement, SYS 1-13 
utmp file 
See also wtmp file 
4.2BSD improvement, SYS 1-17 
uu driver 
4.2BSD improvement, SYS 1-16 
uu.c device driver 
4.2BSD improvement, SYS 5-12 
uucico program 
defined, SYS 5-131 
description, SYS 5-124, 5-134 to 
5-137 
functions, SYS 5-125 
starting, SYS 5-125, 5-134 
starting with shell file, SYS 5-143 
uuclean program 
defined, SYS 5-131 
description, SYS 5-137 
uucp command 
command line format, SYS 5-131 
defined, SYS 5-125 
description, SYS 5-131 to 5-133 
transferring files between 
machines, SYS 5-132E 
UUCP network 
ARPANET and, GEN 2-26 
uucp program 
defined, SYS 5-131 
uucp system 
4.2BSD improvement, SYS 1-4, 
1-9, 5-45 


uucp system (Cont.) 
administration, SYS 5-142 to 
5-144 
defined, SYS 5-131 
directory list, SYS 5-45 
file list, SYS 5-45 to 5-46 
implementing, SYS 5-131 to 5-144 
installing, SYS 5-138 to 5-142 
login entry and, SYS 5-144 
security and, SYS 5-138 
setting up, SYS 5-45 to 5-46 
uucp.h file 
modifying for uucp, SYS 5-138 
uulog program 
defined, SYS 5-131 
description, SYS 5-137 
uusnap program 
description, SYS 1-9 
uux command 
command line format, SYS 5-133 
defined, SYS 5-125 
description, SYS 5-133 to 5-134 
providing remote output, SYS 
5-127 
uux program 
defined, SYS 5-131 
uuxqt program 
defined, SYS 5-131 
description, SYS 5-137 


V 


v command (DC) 
description, GEN 2-58 
v command (ed) 
defined, GEN 3-34 
specifying line numbers, GEN 
3-47 
specifying lines without text 
patterns, GEN 3-46 to 3-47 
using, GEN 3-33 
v command (troff) 
creating decorative initial capital, 
GEN 5-87E 
moving characters up and down, 
GEN 5-87 
specifying vertical motion, GEN 
5-68 
v escape (Mail) 
description, GEN 2-24 
v flag (Mail) 
See also verbose option 
defined, GEN 2-36 
v option (inv) 
defined, GEN 5-148 
va driver 
4.2BSD improvement, SYS 1-16 
va.c file 
4.2BSD improvement, SYS 5-13 
Valued option (Mail) 
See also Option (Mail) 
defined, GEN 2-20 
Variable (BC) 
declaring automatic, GEN 2-46 
number permitted, GEN 2-45 
Variable (Bourne shell) 
description, GEN 4-10 to 4-12 
reference list, GEN 4-11 
Variable (C shell) 
accessing components, GEN 4-54 
checking for assigned value, GEN 
4-53 
defined, GEN 4-74 
removing definition from shell, 
GEN 4-52 
removing from environment, GEN 
4-52 
Variable (Screen package) 
reference list, PGM 4-77 
Variable expansion 
See Expansion 
See Variable 
Variable substitution 
description, GEN 4-53 
VAX UNIX system 
accounting, SYS 5-56 
booting, SYS 5-52 
booting for single user, SYS 5-52 
changing from single user to 
multiuser status, SYS 5-52 
changing to multiuser from single 
user status, SYS 5-52 
checking file system, SYS 5-53 
file maintenance list, SYS 5-57 
monitoring system performance, 
SYS 5-54 
operating procedures, SYS 5-52 
regenerating, SYS 5-55 
resource control, SYS 5-56 
tracking changes, SYS 5-56 
VAX-11/750 
configuration file, SYS 5-85 
VAX-11/750 console cassette 
interface 
See tu driver 
VAX-11/780 
configuration file, SYS 5-84 
VAX/VMS Operating System 
autoconfiguration, SYS 5-89 to 
5-95 
data structure sizing rules, SYS 
5-103 to 5-105 
VAX/VMS system sources 
directory list, SYS 5-4 
ve command (ex) 
description, GEN 3-94 
verbose option (Mail) 
See also -v flag 
defined, GEN 2-35 
verbose variable (C shell) 
defined, GEN 4-74 
Version 
suppressing for Mail, GEN 2-35 
version command (ex) 
See ve command (ex) 
Vertical bar (EQN) 
typesetting in proper size, GEN 
5-100E 
Vertical spacing 
setting with troff, GEN 5-84 
Vesterman, W., & Cherry, L.L. 
style and diction programs, GEN 
5-163 to 5-177 
vfontinfo program 
font information and, SYS 1-9 
vfork system call 
future plans, SYS 1-13 
vgrind 
4.2BSD improvement, SYS 1-9 
vgrindefs file 
4.2BSD improvement, SYS 1-17 
vi command (ex) 
See also open option 
3.5 changes, GEN 3-102 
description, GEN 3-94 
screen editing and, GEN 3-85 
vi screen editor 
4.2BSD improvement, SYS 1-9 
changing words, GEN 3-60 
character editing, GEN 3-59 
character editing, low level, GEN 
3-61 
character functions, GEN 3-75T 
characters for making corrections 
in input mode, GEN 3-72T 
commands for file manipulation, 
GEN 3-71T 
deleting lines, GEN 3-60 
deleting words, GEN 3-59 
description, GEN 3-53 to 3-82 


vi screen editor (Cont.) 
determining state of file, GEN 
3-57 
editing programs, GEN 3-67 
ending a session, GEN 3-55 
ex 3.5 changes and, GEN 3-103 to 
3-104 
ex and, GEN 3-73 
executing shell command from, 
GEN 3-63 
ignoring case, GEN 3-72 
inserting text, GEN 3-58 
invoking, GEN 3-54E 
line editing, GEN 3-60 
manipulating files, GEN 3-70 
marking return points, GEN 3-64 
moving blocks of text, GEN 3-62 
moving in the file, GEN 3-56 to 
3-58 
moving on the screen, GEN 3-57 
moving to previous position, GEN 
3-57 
moving within a line, GEN 3-57 
option list, GEN 3-65 
presenting lines, GEN 3-69 
recovering lost files, GEN 3-66 
recovering lost lines, GEN 3-66 
reversing your changes, GEN 3-60 
saving changes automatically, 
GEN 3-63 
searching for strings in text, GEN 
3-56, 3-71 
sentences and, GEN 3-61 
view command (ex) 
description, GEN 3-102 
view command (vi) 
reading a file, GEN 3-58 
vipw program 
4.2BSD improvement, SYS 1-21 
vipw script 
See vipw program 
visual command (ex) 
See vi command (ex) 
visual command (Mail) 
See also edit command (Mail) 
description, GEN 2-33 
VISUAL option (Mail) 
defined, GEN 2-33 
setting, GEN 2-33 
specifying an editor, GEN 2-24 
vlimit system call 
See getrlimit system call 
vlp program
printing lisp programs, SYS 1-9 


vm_machdep.c file
4.2BSD improvement, SYS 5-13
vm_mem.c file
contents, SYS 5-11
vm_mon.c file
contents, SYS 5-11
vm_page.c file
4.2BSD improvement, SYS 5-11
vm_proc.c file
contents, SYS 5-11
vm_pt.c file
contents, SYS 5-11
vm_sched.c file
contents, SYS 5-11
vm_subr.c file
contents, SYS 5-11
vm_sw.c file
contents, SYS 5-11
vm_swap.c file
contents, SYS 5-11
vm_swp.c file
contents, SYS 5-11
vm_text.c file
contents, SYS 5-11
vmmac.h file
4.2BSD improvement, SYS 5-7
vmparam.h file
4.2BSD improvement, SYS 5-7,
5-13
vmstat program
4.2BSD improvement, SYS 1-9
monitoring system activity, SYS
5-54
vmsystm.h file
4.2BSD improvement, SYS 5-7
vpr program
shell scripts and, SYS 1-10
vread system call
obsolete, SYS 1-13
vs command (nroff/troff)
defined, GEN 5-61
setting, GEN 5-84
vswapon system call
See swapon system call
vtimes system call
See getrusage system call
vv network interface driver
4.2BSD improvement, SYS 1-16
vwidth program
troff width tables and, SYS 1-10
vwrite system call
obsolete, SYS 1-13
W


w command (ed) 
defined, GEN 3-34 
e command and, GEN 3-27 
entering text into a file, GEN 2-6 
saving lines for input, GEN 3-50 
using, GEN 3-26 
w command (edit) 
description, GEN 3-22 
u command and, GEN 3-16 
using, GEN 3-8 
w command (ex) 
See also wq command (ex) 
description, GEN 3-94 
w command (nroff/troff) 
description, GEN 5-68 
w command (sed) 
defined, GEN 3-111 
W command (vi) 
defined, GEN 3-80 
w command (vi) 
defined, GEN 3-81 
w escape (Mail) 
description, GEN 2-24 
w flag (mkey) 
specifying a file, GEN 5-147 
w flag (sed) 
defined, GEN 3-110 
w option (troff) 
defined, GEN 5-50 
wait function 
description, PGM 1-14 
wait system call 
See also wait.h file 
4.2BSD improvement, SYS 1-14
wait.h file 
4.2BSD improvement, SYS 5-6
wait3 system call 
See also wait.h file 
4.2BSD improvement, SYS 1-14 
warn option (ex) 
description, GEN 3-101 
Wasley, D.L. 
introduction to f77 I/O library, 
PGM 2-79 to 2-88 
wc command (C shell)
4.2BSD improvement, SYS 1-10
defined, GEN 2-13, 4-74 
printing a list of files and, GEN 
2-11 
WDATA operator (C compiler) 
defined, PGM 2-64 
Weinberger, P.J., & Feldman, S.I.
Fortran 77 compiler, PGM 2-89 to
2-109
Weinberger, P.J., & others 
awk programming language, PGM 
3-5 to 3-12 
wh command (nroff/troff) 
defined, GEN 5-65 
whereis 
4.2BSD improvement, SYS 1-10 
which 
4.2BSD improvement, SYS 1-10 
while statement (awk) 
defined, PGM 3-9 
while statement (BC), GEN 2-47 
forming, GEN 2-54 
writing, GEN 2-47 
while statement (C shell) 
See also until statement (C shell) 
defined, GEN 4-74 
description, GEN 4-12 to 4-13 
exiting, GEN 4-58 
form of, GEN 4-12E 
forms of, GEN 4-58 
who command 
4.2BSD improvement, SYS 1-10 
printing list of people logged on, 
GEN 2-11E 
using, GEN 2-4 
Width command (nroff/troff) 
See w command (nroff/troff) 
winch routine 
defined, PGM 4-86 
Window 
defined, PGM 4-75 
description, PGM 4-76 
moving, GEN 2-33 
window option (ex) 
description, GEN 3-101 
window option (Mail) 
headers command and, GEN 2-30 
WINDOW structure 
defined, PGM 4-91E 
description, PGM 4-76 
Word (C shell) 
defined, GEN 4-74 
Word (nroff/troff) 
defined, GEN 5-60 
Word abbreviation 
See also Macro (vi) 
description, GEN 3-69 
Word list 
specifying for hyphenation, GEN 
5-69 


Work file 
defined, SYS 5-132 
Working directory 
changing, GEN 4-48 
changing background job to 
foreground job and, GEN 4-50 
changing with programs, GEN 
4-50 
defined, GEN 4-74 
description, GEN 4-48 to 4-50 
wq command (ex) 
See also xit command (ex) 
description, GEN 3-94 
wrapmargin option (ex) 
3.5 changes, GEN 3-102 
description, GEN 3-101 
wrapscan option (ex) 
description, GEN 3-101 
write command (C shell) 
defined, GEN 4-74 
write command (ed) 
See w command (ed) 
write command (edit) 
See w command (edit) 
write command (ex) 
See w command (ex) 
write command (Mail) 
See also save command (Mail) 
description, GEN 2-33 
write function 
description, PGM 1-9 
write system call 
4.2BSD improvement, SYS 1-14 
writeany option (ex) 
description, GEN 3-101 
writev system call 
4.2BSD improvement, SYS 1-14 
wtmp file 
See also utmp file 
4.2BSD improvement, SYS 1-17 


X 


x command (Mail) 
exiting Mail, GEN 2-22 
x command (me) 
defined, GEN 5-43 
entering, GEN 5-29 
X command (sed) 
defined, GEN 3-113 
X command (vi) 
defined, GEN 3-80 
x command (vi) 
defined, GEN 3-81 


x option (uucico) 
defined, SYS 5-135 
x option (uuclean) 
defined, SYS 5-138 
x option (uucp) 
defined, SYS 5-132 
x option (uux) 
description, SYS 5-133 
Xerox Courier protocol 
description, SYS 3-17 
Xerox experimental Ethernet 
controller 
See en network interface driver 
Xerox NS Sequenced Packet 
protocol 
sequenced packet socket and, SYS 
3-6 
Xerox Routing Information Protocol 
See routed program 
xit command (ex) 
See also wq command (ex) 
description, GEN 3-94 
xl command (me) 
defined, GEN 5-45 
xp command (me) 
defined, GEN 5-43 
XP macro 
description, GEN 5-18 
XS macro 
description, GEN 5-18 
xtr script file 
running, SYS 5-26E 


Y 


Y command (vi) 
defined, GEN 3-80 
using, GEN 3-62 
y operator 
See also Y command (vi) 
moving blocks of text, GEN 3-62 
ya command (ex) 
description, GEN 3-95 
Yacc 
See also Lex program generator 
description, PGM 3-79 to 3-111 
yank command (ex) 
See ya command (ex) 


Z 


z command (DC) 
description, GEN 2-59 
z command (edit)
printing a screen of text, GEN 
3-12, 3-13E 
z command (ex) 
description, GEN 3-95 
z command (Mail) 
description, GEN 2-33 
z command (me)
defined, GEN 5-42 
entering, GEN 5-26 
specifying fill mode, GEN 5-26 
z command (nroff/troff) 
creating overstruck characters, 
GEN 5-88 
description, GEN 5-68 
z command (vi) 
defined, GEN 3-81 
positioning screen text, GEN 3-64 
z option (nroff/troff) 
defined, GEN 5-81 
Zero 
as legal line number, GEN 3-46 
ZZ command (vi) 
defined, GEN 3-80 
description, GEN 3-55 